Isoprenoid production

ABSTRACT

The invention provides methods and materials related to the production of isoprenoids. Specifically, the invention provides isolated nucleic acids, substantially pure polypeptides, host cells, and methods and materials for producing various isoprenoid compounds.

BACKGROUND

[0001] 1. Technical Field

[0002] The invention relates to methods and materials involved in the production of isoprenoids.

[0003] 2. Background Information

[0004] Isoprenoids are compounds that have at least one five-carbon isoprenoid unit. Examples of isoprenoid compounds include, without limitation, carotenoids, isoprenes, sterols, terpenes, and ubiquinones. Various enzymatic pathways in plants, animals, and microorganisms result in the synthesis of isoprenoid compounds. Typically, isopentenyl diphosphate (IPP), dimethylallyl diphosphate (DMAPP), or combinations thereof are polymerized to form isoprenoid compounds.

[0005] Two pathways can be used to produce IPP. The first pathway, known as the mevalonate-dependent pathway, produces IPP from 3-hydroxy-3-methylglutaryl Coenzyme A (HMGCoA) in a series of reactions. The second pathway, known as the mevalonate-independent pathway, produces IPP from 1-deoxyxylulose-5-phosphate (DXP) in a series of reactions. One of those reactions involves the use of DXP synthase (DXS) to catalyze the condensation of pyruvate and glyceraldehyde-3-phosphate to form DXP.

[0006] Once made, IPP can be used to make various isoprenoid compounds. Specifically, enzymes known as polyprenyl diphosphate synthases catalyze polymerization reactions that combine IPP and DMAPP to form compounds known as polyprenyl diphosphates. For example, decaprenyl diphosphate synthase (DDS) catalyzes the consecutive condensation of IPP with allylic diphosphates to produce decaprenyl diphosphate. Decaprenyl diphosphate is a polyprenyl diphosphate that can be used to form the side chain of a ubiquinone known as CoQ(10). Other polyprenyl diphosphate synthases include, without limitation, farnesyl-, geranyl-, and octaprenyl diphosphate synthases.

SUMMARY

[0007] The invention relates to methods and materials involved in the production of isoprenoid compounds. Specifically, the invention provides nucleic acid molecules, polypeptides, host cells, and methods that can be used to produce isoprenoid compounds. Isoprenoid compounds are both biologically and commercially important. For example, the nutritional industry uses isoprenoid compounds as nutritional supplements, while the perfume industry uses isoprenoid compounds as fragrances. The nucleic acid molecules described herein can be used to engineer host cells having the ability to produce particular isoprenoid compounds. The polypeptides described herein can be used in cell-free systems to make particular isoprenoid compounds. The host cells described herein can be used in culture systems to produce large quantities of particular isoprenoid compounds.

[0008] In general, the invention features an isolated nucleic acid containing a nucleic acid sequence having a length and a percent identity to the sequence set forth in SEQ ID NO: 1 over the length, wherein the point defined by the length and the percent identity is within the area defined by points A, B, C, and D of FIG. 26, wherein point A has coordinates (3626, 100), point B has coordinates (3626, 65), point C has coordinates (50, 65), and point D has coordinates (12, 100). The point B can have coordinates (3626, 85). The point C can have coordinates (100, 65). The point C can have coordinates (50, 85). The point D can have coordinates (15, 100). The nucleic acid sequence can encode a polypeptide. The polypeptide can have DXS activity. The nucleic acid sequence can be as set forth in SEQ ID NO: 1.
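By way of illustration only, the area test recited above can be sketched as a point-in-polygon check on the pair (length, percent identity). The sketch below uses the coordinates given for SEQ ID NO: 1. The names `in_area` and `SEQ1_AREA` are hypothetical, and treating points that fall exactly on the boundary as being within the area is an assumption of this sketch, not a statement of the specification.

```python
from typing import Sequence, Tuple

Point = Tuple[float, float]  # (length, percent identity)

# Points A, B, C, and D for SEQ ID NO: 1, as recited in the text.
# SEQ1_AREA is a hypothetical name used only for this sketch.
SEQ1_AREA: Sequence[Point] = [
    (3626, 100),  # A
    (3626, 65),   # B
    (50, 65),     # C
    (12, 100),    # D
]


def in_area(length: float, identity: float, corners: Sequence[Point]) -> bool:
    """Return True if (length, identity) lies inside or on a convex polygon.

    The corners must be listed in order around the polygon. A point is
    inside when it is on the same side of every edge; points exactly on
    an edge are accepted (an assumption of this sketch).
    """
    side = 0
    n = len(corners)
    for i in range(n):
        x1, y1 = corners[i]
        x2, y2 = corners[(i + 1) % n]
        # Sign of the cross product tells which side of edge i the point is on.
        cross = (x2 - x1) * (identity - y1) - (y2 - y1) * (length - x1)
        if cross == 0:
            continue  # on the line through this edge; other edges decide
        current = 1 if cross > 0 else -1
        if side == 0:
            side = current
        elif current != side:
            return False
    return True


if __name__ == "__main__":
    print(in_area(1926, 90, SEQ1_AREA))  # a 1926-nucleotide sequence at 90 percent identity
    print(in_area(30, 65, SEQ1_AREA))    # a 30-nucleotide sequence at 65 percent identity
```

Because the edge from point C to point D slopes, shorter sequences need a higher percent identity to fall within the area: in this sketch a 30-nucleotide sequence is outside the area at 65 percent identity but inside it at 100 percent identity.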

[0009] In one embodiment, the invention features an isolated nucleic acid containing a nucleic acid sequence having a length and a percent identity to the sequence set forth in SEQ ID NO:2 over the length, wherein the point defined by the length and the percent identity is within the area defined by points A, B, C, and D of FIG. 26, wherein point A has coordinates (1926, 100), point B has coordinates (1926, 65), point C has coordinates (50, 65), and point D has coordinates (12, 100). The nucleic acid sequence can encode a polypeptide. The polypeptide can have DXS activity.

[0010] In another embodiment, the invention features an isolated nucleic acid containing a nucleic acid sequence, wherein the nucleic acid sequence encodes a polypeptide containing an amino acid sequence, wherein the amino acid sequence has a length and a percent identity to the sequence set forth in SEQ ID NO:3 over the length, wherein the point defined by the length and the percent identity is within the area defined by points A, B, C, and D of FIG. 26, wherein point A has coordinates (641, 100), point B has coordinates (641, 65), point C has coordinates (25, 65), and point D has coordinates (5, 100). The polypeptide can have DXS activity.

[0011] Another embodiment of the invention features an isolated nucleic acid containing a nucleic acid sequence having a length and a percent identity to the sequence set forth in SEQ ID NO:37 over the length, wherein the point defined by the length and the percent identity is within the area defined by points A, B, C, and D of FIG. 26, wherein point A has coordinates (1990, 100), point B has coordinates (1990, 65), point C has coordinates (50, 65), and point D has coordinates (16, 100). The point B can have coordinates (1990, 85). The point C can have coordinates (100, 65). The point C can have coordinates (50, 85). The point D can have coordinates (20, 100). The nucleic acid sequence can encode a polypeptide. The polypeptide can have DDS activity. The nucleic acid sequence can be as set forth in SEQ ID NO:37.

[0012] Another embodiment of the invention features an isolated nucleic acid containing a nucleic acid sequence having a length and a percent identity to the sequence set forth in SEQ ID NO:38 over the length, wherein the point defined by the length and the percent identity is within the area defined by points A, B, C, and D of FIG. 26, wherein point A has coordinates (1002, 100), point B has coordinates (1002, 65), point C has coordinates (50, 65), and point D has coordinates (16, 100). The nucleic acid sequence can encode a polypeptide. The polypeptide can have DDS activity.

[0013] Another embodiment of the invention features an isolated nucleic acid containing a nucleic acid sequence, wherein the nucleic acid sequence encodes a polypeptide containing an amino acid sequence, wherein the amino acid sequence has a length and a percent identity to the sequence set forth in SEQ ID NO:39 over the length, wherein the point defined by the length and the percent identity is within the area defined by points A, B, C, and D of FIG. 26, wherein point A has coordinates (333, 100), point B has coordinates (333, 65), point C has coordinates (25, 65), and point D has coordinates (5, 100). The polypeptide can have DDS activity.

[0014] Another embodiment of the invention features an isolated nucleic acid containing a nucleic acid sequence having a length and a percent identity to the sequence set forth in SEQ ID NO:40 over the length, wherein the point defined by the length and the percent identity is within the area defined by points A, B, C, and D of FIG. 26, wherein point A has coordinates (1833, 100), point B has coordinates (1833, 65), point C has coordinates (50, 65), and point D has coordinates (16, 100). The point B can have coordinates (1833, 85). The point C can have coordinates (100, 65). The point C can have coordinates (50, 85). The point D can have coordinates (20, 100). The nucleic acid sequence can encode a polypeptide. The polypeptide can have DDS activity. The nucleic acid sequence can be as set forth in SEQ ID NO:40.

[0015] Another embodiment of the invention features an isolated nucleic acid containing a nucleic acid sequence having a length and a percent identity to the sequence set forth in SEQ ID NO:41 over the length, wherein the point defined by the length and the percent identity is within the area defined by points A, B, C, and D of FIG. 26, wherein point A has coordinates (1014, 100), point B has coordinates (1014, 65), point C has coordinates (50, 65), and point D has coordinates (16, 100). The nucleic acid sequence can encode a polypeptide. The polypeptide can have DDS activity.

[0016] Another embodiment of the invention features an isolated nucleic acid containing a nucleic acid sequence, wherein the nucleic acid sequence encodes a polypeptide containing an amino acid sequence, wherein the amino acid sequence has a length and a percent identity to the sequence set forth in SEQ ID NO:42 over the length, wherein the point defined by the length and the percent identity is within the area defined by points A, B, C, and D of FIG. 26, wherein point A has coordinates (337, 100), point B has coordinates (337, 65), point C has coordinates (25, 65), and point D has coordinates (5, 100). The polypeptide can have DDS activity.

[0017] Another embodiment of the invention features an isolated nucleic acid containing a nucleic acid sequence having a length and a percent identity to the sequence set forth in SEQ ID NO:95 over the length, wherein the point defined by the length and the percent identity is within the area defined by points A, B, C, and D of FIG. 26, wherein point A has coordinates (2017, 100), point B has coordinates (2017, 65), point C has coordinates (50, 65), and point D has coordinates (16, 100). The point B can have coordinates (2017, 85). The point C can have coordinates (100, 65). The point C can have coordinates (50, 85). The point D can have coordinates (20, 100). The nucleic acid sequence can encode a polypeptide. The polypeptide can have DXR activity. The nucleic acid sequence can be as set forth in SEQ ID NO:95.

[0018] Another embodiment of the invention features an isolated nucleic acid containing a nucleic acid sequence having a length and a percent identity to the sequence set forth in SEQ ID NO:96 over the length, wherein the point defined by the length and the percent identity is within the area defined by points A, B, C, and D of FIG. 26, wherein point A has coordinates (1161, 100), point B has coordinates (1161, 65), point C has coordinates (50, 65), and point D has coordinates (16, 100). The nucleic acid sequence can encode a polypeptide. The polypeptide can have DXR activity.

[0019] Another embodiment of the invention features an isolated nucleic acid containing a nucleic acid sequence, wherein the nucleic acid sequence encodes a polypeptide containing an amino acid sequence, wherein the amino acid sequence has a length and a percent identity to the sequence set forth in SEQ ID NO:97 over the length, wherein the point defined by the length and the percent identity is within the area defined by points A, B, C, and D of FIG. 26, wherein point A has coordinates (386, 100), point B has coordinates (386, 65), point C has coordinates (25, 65), and point D has coordinates (5, 100). The polypeptide can have DXR activity.

[0020] Another embodiment of the invention features an isolated nucleic acid containing a nucleic acid sequence of at least 12 nucleotides, wherein the isolated nucleic acid hybridizes under hybridization conditions to the sense or antisense strand of a nucleic acid molecule, the sequence of the nucleic acid molecule being the sequence set forth in SEQ ID NO: 1, 2, 37, 38, 40, 41, 95, or 96. The nucleic acid sequence can be at least 50 nucleotides (e.g., at least 100, 200, 300, 400, 500, or more). The nucleic acid sequence can encode a polypeptide. The polypeptide can have DXS, DDS, or DXR activity.

[0021] In another aspect, the invention features a substantially pure polypeptide containing an amino acid sequence, wherein the amino acid sequence has a length and a percent identity to the sequence set forth in SEQ ID NO:3 over the length, wherein the point defined by the length and the percent identity is within the area defined by points A, B, C, and D of FIG. 26, wherein point A has coordinates (641, 100), point B has coordinates (641, 65), point C has coordinates (25, 65), and point D has coordinates (5, 100). The polypeptide can have DXS activity.

[0022] In another embodiment, the invention features a substantially pure polypeptide containing an amino acid sequence, wherein the amino acid sequence has a length and a percent identity to the sequence set forth in SEQ ID NO:39 over the length, wherein the point defined by the length and the percent identity is within the area defined by points A, B, C, and D of FIG. 26, wherein point A has coordinates (333, 100), point B has coordinates (333, 65), point C has coordinates (25, 65), and point D has coordinates (5, 100). The polypeptide can have DDS activity.

[0023] Another embodiment of the invention features a substantially pure polypeptide containing an amino acid sequence, wherein the amino acid sequence has a length and a percent identity to the sequence set forth in SEQ ID NO:42 over the length, wherein the point defined by the length and the percent identity is within the area defined by points A, B, C, and D of FIG. 26, wherein point A has coordinates (337, 100), point B has coordinates (337, 65), point C has coordinates (25, 65), and point D has coordinates (5, 100). The polypeptide can have DDS activity.

[0024] Another embodiment of the invention features a substantially pure polypeptide containing an amino acid sequence, wherein the amino acid sequence has a length and a percent identity to the sequence set forth in SEQ ID NO:97 over the length, wherein the point defined by the length and the percent identity is within the area defined by points A, B, C, and D of FIG. 26, wherein point A has coordinates (386, 100), point B has coordinates (386, 65), point C has coordinates (25, 65), and point D has coordinates (5, 100). The polypeptide can have DXR activity.

[0025] Another aspect of the invention features a host cell containing an isolated nucleic acid of claim 1, 9, 12, 14, 22, 25, 27, 35, 38, 40, 48, 51, or 53. The host cell can be prokaryotic. The host cell can be a Rhodobacter, Sphingomonas, or Escherichia cell. The host cell can contain an exogenous nucleic acid that encodes a polypeptide having DDS, DXS, ODS, SDS, DXR, 4-diphosphocytidyl-2C-methyl-D-erythritol synthase, 4-diphosphocytidyl-2C-methyl-D-erythritol kinase, or chorismate lyase activity. The host cell can contain an exogenous nucleic acid containing a UbiC sequence or LytB sequence. The host cell can contain an exogenous nucleic acid containing a UbiC sequence and LytB sequence. The host cell can contain a non-functional crtE sequence, ppsR sequence, or ccoN sequence. The host cell can contain a non-functional crtE sequence, ppsR sequence, and ccoN sequence.

[0026] Another embodiment of the invention features a host cell containing an exogenous nucleic acid and a non-functional crtE sequence, ppsR sequence, or ccoN sequence, wherein the exogenous nucleic acid is within a crtE, ppsR, or ccoN locus of the host cell.

[0027] Another embodiment of the invention features a host cell containing a genomic deletion, wherein the deletion comprises at least a portion of a crtE sequence, ppsR sequence, or ccoN sequence, and wherein the host cell comprises a non-functional crtE sequence, ppsR sequence, or ccoN sequence.

[0028] Another aspect of the invention features a method for increasing production of CoQ(10) in a cell having endogenous DDS activity. The method includes inserting a nucleic acid molecule containing a nucleic acid sequence that encodes a polypeptide having DDS activity into the cell such that production of CoQ(10) is increased. The nucleic acid molecule can contain an isolated nucleic acid of claim 14, 22, 25, 27, 35, 38, or 53. The production of CoQ(10) can be increased at least about 5 percent as compared to a control cell lacking the inserted nucleic acid molecule. The cell can be a Rhodobacter or Sphingomonas cell. The cell can be a membraneous bacterium or highly membraneous bacterium. The method can also include inserting a second nucleic acid molecule containing a nucleotide sequence that encodes a polypeptide having DXS activity into the cell. The second nucleic acid molecule can contain an isolated nucleic acid of claim 1, 9, or 12.

[0029] In another embodiment, the invention features a method for increasing production of CoQ(10) in a cell having endogenous DDS activity. The method includes inserting a nucleic acid molecule containing a nucleic acid sequence that encodes a polypeptide having DXS activity into the cell such that production of CoQ(10) is increased. The production of CoQ(10) can be increased at least about 5 percent as compared to a control cell lacking the inserted nucleic acid molecule. The cell can be a Rhodobacter or Sphingomonas cell. The nucleic acid molecule can contain an isolated nucleic acid of claim 1, 9, or 12. The cell can be a membraneous bacterium or highly membraneous bacterium. The method can also include inserting a second nucleic acid molecule containing a nucleotide sequence that encodes a polypeptide having DDS activity into the cell. The second nucleic acid molecule can contain an isolated nucleic acid of claim 14, 22, 25, 27, 35, 38, or 53.

[0030] Another embodiment of the invention features a method for increasing production of CoQ(10) in a membraneous bacterium. The method includes inserting a nucleic acid molecule containing a nucleic acid sequence that encodes a polypeptide having DDS activity into the bacterium such that production of CoQ(10) is increased.

[0031] Another embodiment of the invention features a method for increasing production of CoQ(10) in a highly membraneous bacterium. The method includes inserting a nucleic acid molecule containing a nucleic acid sequence that encodes a polypeptide having DDS activity into the highly membraneous bacterium such that production of CoQ(10) is increased.

[0032] Another embodiment of the invention features a method for making an isoprenoid. The method includes culturing a cell under conditions wherein the cell produces the isoprenoid, wherein the cell contains at least one exogenous nucleic acid that encodes at least one polypeptide, wherein the cell produces more of the isoprenoid than a comparable cell lacking the at least one exogenous nucleic acid. The cell can be a Rhodobacter or Sphingomonas cell. The isoprenoid can be CoQ(10). The at least one polypeptide can have DDS, DXS, ODS, SDS, DXR, 4-diphosphocytidyl-2C-methyl-D-erythritol synthase, 4-diphosphocytidyl-2C-methyl-D-erythritol kinase, or chorismate lyase activity. The at least one polypeptide can be a UbiC polypeptide or a LytB polypeptide. The cell can contain a non-functional crtE sequence, ppsR sequence, or ccoN sequence. The cell can contain a non-functional crtE sequence, ppsR sequence, and ccoN sequence. The cell can contain a genomic deletion, wherein the deletion contains at least a portion of a crtE sequence, ppsR sequence, or ccoN sequence, and wherein the cell contains a non-functional crtE sequence, ppsR sequence, or ccoN sequence.

[0033] Another embodiment of the invention features a method for making an isoprenoid. The method includes culturing a genetically modified cell under conditions wherein the cell produces the isoprenoid. The isoprenoid can be CoQ(10). The cell can contain an exogenous nucleic acid. The cell can contain a genomic deletion.

[0034] Unless otherwise defined, all technical and scientific terms used herein have the same meaning as commonly understood by one of ordinary skill in the art to which this invention pertains. Although methods and materials similar or equivalent to those described herein can be used in the practice or testing of the present invention, suitable methods and materials are described below. All publications, patent applications, patents, and other references mentioned herein are incorporated by reference in their entirety. In case of conflict, the present specification, including definitions, will control. In addition, the materials, methods, and examples are illustrative only and not intended to be limiting.

[0035] Other features and advantages of the invention will be apparent from the following detailed description, and from the claims.

DESCRIPTION OF DRAWINGS

[0036] FIG. 1 is a diagram of a pathway for producing CoQ(10).

[0037] FIG. 2 is a listing of a nucleic acid sequence that encodes a Sphingomonas trueperi (ATCC 12417) polypeptide having DXS activity (SEQ ID NO:1). The start codon is the ATG at nucleotide number 182, and the stop codon is the TAA at nucleotide number 2107. The probable ribosome binding site is at nucleotide numbers 175-178. This sequence contains an open reading frame as well as 5′ and 3′ untranslated sequences.

[0038] FIG. 3 is a listing of a nucleic acid sequence that encodes a Sphingomonas trueperi (ATCC 12417) polypeptide having DXS activity (SEQ ID NO:2). This sequence corresponds to the open reading frame.

[0039] FIG. 4 is a listing of an amino acid sequence of a Sphingomonas trueperi (ATCC 12417) polypeptide having DXS activity (SEQ ID NO:3).

[0040] FIG. 5 is a sequence pile-up of 14 nucleic acid sequences that encode polypeptides having DXS activity. STdxsdna represents the nucleic acid sequence set forth in SEQ ID NO:2; CRdxsdna represents a nucleic acid sequence from Chlamydomonas reinhardtii (GenBank accession number AJ007559; SEQ ID NO:4); CJdxsdna represents a nucleic acid sequence from Campylobacter jejuni (GenBank accession number AL139074; SEQ ID NO:5); PAdxsdna represents a nucleic acid sequence from Pseudomonas aeruginosa (GenBank accession number AE004821; SEQ ID NO:6); LEdxsdna represents a nucleic acid sequence from Lycopersicon esculentum (GenBank accession number AF143812; SEQ ID NO:7); MTdxsdna represents a nucleic acid sequence from Mycobacterium tuberculosis (GenBank accession number Z96072; SEQ ID NO:8); RSdxs1dna represents a nucleic acid sequence from a Rhodobacter sphaeroides dxs1 gene (SEQ ID NO:9); RSdxs2dna represents a nucleic acid sequence from a Rhodobacter sphaeroides dxs2 gene (SEQ ID NO:10); SPCCdxsdna represents a nucleic acid sequence from Synechococcus PCC6301 (GenBank accession number Y18874; SEQ ID NO:11); ECdxsdna represents a nucleic acid sequence from Escherichia coli (GenBank accession number AF035440; SEQ ID NO:12); NMdxsdna represents a nucleic acid sequence from Neisseria meningitidis (GenBank accession number AL162753; SEQ ID NO:13); HIdxsdna represents a nucleic acid sequence from Haemophilus influenzae (GenBank accession number U32822; SEQ ID NO:14); SSdxsdna represents a nucleic acid sequence from Streptomyces sp. CL190 (GenBank accession number AB026631; SEQ ID NO:16); and HPdxsdna represents a nucleic acid sequence from Helicobacter pylori 26695 (GenBank accession number AE000552; SEQ ID NO:17).

[0041] FIG. 6 is a sequence pile-up of 21 amino acid sequences of polypeptides having DXS activity. STdxsp represents the amino acid sequence set forth in SEQ ID NO:3; AAdxsp represents an amino acid sequence from Aquifex aeolicus (GenBank accession number O67036; SEQ ID NO:18); BSdxsp represents an amino acid sequence from Bacillus subtilis (GenBank accession number P54523; SEQ ID NO:19); CRdxsp represents an amino acid sequence from Chlamydomonas reinhardtii (GenBank accession number CAA07554; SEQ ID NO:20); CJdxsp represents an amino acid sequence from Campylobacter jejuni (GenBank accession number CAB72788; SEQ ID NO:21); PAdxsp represents an amino acid sequence from Pseudomonas aeruginosa (GenBank accession number AAG07431; SEQ ID NO:15); LEdxsp represents an amino acid sequence from Lycopersicon esculentum (GenBank accession number AAD38941; SEQ ID NO:22); MLdxsp represents an amino acid sequence from Mycobacterium leprae (GenBank accession number Q50000; SEQ ID NO:23); MTdxsp represents an amino acid sequence from Mycobacterium tuberculosis (GenBank accession number CAB09493; SEQ ID NO:24); RCdxsp represents an amino acid sequence from Rhodobacter capsulatus (GenBank accession number P26242; SEQ ID NO:25); RSdxs1p represents an amino acid sequence encoded by a Rhodobacter sphaeroides dxs1 gene (SEQ ID NO:26); RSdxs2p represents an amino acid sequence encoded by a Rhodobacter sphaeroides dxs2 gene (SEQ ID NO:27); SPCCdxsp represents an amino acid sequence from Synechococcus PCC6301 (GenBank accession number CAB60078; SEQ ID NO:28); SPdxsp represents an amino acid sequence from Synechocystis PCC6803 (GenBank accession number P73067; SEQ ID NO:29); TMdxsp represents an amino acid sequence from Thermotoga maritima (GenBank accession number Q9X291; SEQ ID NO:30); ECdxsp represents an amino acid sequence from Escherichia coli (GenBank accession number D64771; SEQ ID NO:31); NMdxsp represents an amino acid sequence from Neisseria meningitidis (GenBank accession number CAB83880; SEQ ID NO:32); HIdxsp represents an amino acid sequence from Haemophilus influenzae (GenBank accession number B64172; SEQ ID NO:33); PFdxsp represents an amino acid sequence from Plasmodium falciparum (GenBank accession number AAD03740; SEQ ID NO:34); SSdxsp represents an amino acid sequence from Streptomyces sp. CL190 (GenBank accession number BAA85847; SEQ ID NO:35); and HPdxsp represents an amino acid sequence from Helicobacter pylori 26695 (GenBank accession number AAD07422; SEQ ID NO:36).

[0042] FIG. 7 is a listing of a nucleic acid sequence that encodes a Rhodobacter sphaeroides (ATCC 17023) polypeptide having DDS activity (SEQ ID NO:37). The start codon is the ATG at nucleotide number 372, and the stop codon is the TGA at nucleotide number 1373. The probable ribosome binding site is at nucleotide numbers 363-366. This sequence contains an open reading frame as well as 5′ and 3′ untranslated sequences.

[0043] FIG. 8 is a listing of a nucleic acid sequence that encodes a Rhodobacter sphaeroides (ATCC 17023) polypeptide having DDS activity (SEQ ID NO:38). This sequence corresponds to the open reading frame.

[0044] FIG. 9 is a listing of an amino acid sequence of a Rhodobacter sphaeroides (ATCC 17023) polypeptide having DDS activity (SEQ ID NO:39).

[0045] FIG. 10 is a listing of a nucleic acid sequence that encodes a Sphingomonas trueperi (ATCC 12417) polypeptide having DDS activity (SEQ ID NO:40). The start codon is the ATG at nucleotide number 605, and the stop codon is the TGA at nucleotide number 1618. The probable ribosome binding site is at nucleotide numbers 590-594. This sequence contains an open reading frame as well as 5′ and 3′ untranslated sequences.

[0046] FIG. 11 is a listing of a nucleic acid sequence that encodes a Sphingomonas trueperi (ATCC 12417) polypeptide having DDS activity (SEQ ID NO:41). This sequence corresponds to the open reading frame.

[0047] FIG. 12 is a listing of an amino acid sequence of a Sphingomonas trueperi (ATCC 12417) polypeptide having DDS activity (SEQ ID NO:42). This sequence corresponds to the open reading frame.
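As an illustrative cross-check only, the start and stop codon positions given for FIGS. 2, 7, and 10 can be related to the open reading frame and polypeptide lengths used as point A coordinates in the Summary (1926, 1002, and 1014 nucleotides; 641, 333, and 337 amino acids). The sketch assumes that each stated stop codon position is the last nucleotide of the stop codon and that the stop codon is counted as part of the open reading frame. This reading is an assumption that happens to reproduce the stated lengths. The function names are hypothetical.

```python
def orf_length(start: int, stop_end: int) -> int:
    """Nucleotides from the first base of the start codon through stop_end.

    Assumes 1-based, inclusive positions, with stop_end taken as the last
    nucleotide of the stop codon (an assumption of this sketch).
    """
    if stop_end < start:
        raise ValueError("stop position precedes start position")
    return stop_end - start + 1


def encoded_residues(orf_nt: int) -> int:
    """Amino acids encoded by an ORF whose length includes the stop codon."""
    if orf_nt % 3 != 0:
        raise ValueError("ORF length is not a multiple of three")
    return orf_nt // 3 - 1  # the stop codon encodes no residue


if __name__ == "__main__":
    # (figure, start codon position, stop codon position) as stated in the text
    for figure, start, stop in [("FIG. 2", 182, 2107),
                                ("FIG. 7", 372, 1373),
                                ("FIG. 10", 605, 1618)]:
        nt = orf_length(start, stop)
        print(figure, nt, encoded_residues(nt))
```

Under these assumptions the three figures give 1926, 1002, and 1014 nucleotides and 641, 333, and 337 amino acids, respectively.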

[0048] FIG. 13 is a sequence pile-up of five nucleic acid sequences that encode polypeptides having DDS activity. RSddsdna represents the nucleic acid sequence set forth in SEQ ID NO:38; STddsdna represents the nucleic acid sequence set forth in SEQ ID NO:41; SPddsdna represents a nucleic acid sequence from Schizosaccharomyces pombe (GenBank accession number D84311; SEQ ID NO:43); GSddsdna represents a nucleic acid sequence from Gluconobacter suboxydans (GenBank accession number AB006850; SEQ ID NO:44); and RCddsdna represents a nucleic acid sequence from Rhodobacter capsulatus (U.S. Pat. No. 6,103,488; SEQ ID NO:45).

[0049] FIG. 14 is a sequence pile-up of five amino acid sequences of polypeptides having DDS activity. RSddsp represents the amino acid sequence set forth in SEQ ID NO:39; STddsp represents the amino acid sequence set forth in SEQ ID NO:42; GSddsp represents an amino acid sequence from Gluconobacter suboxydans (GenBank accession number BAA32241; SEQ ID NO:46); SPddsp represents an amino acid sequence from Schizosaccharomyces pombe (GenBank accession number CAB66154; SEQ ID NO:47); and RCddsp represents an amino acid sequence from Rhodobacter capsulatus (U.S. Pat. No. 6,103,488; SEQ ID NO:48).

[0050] FIG. 15 is a sequence pile-up of three amino acid sequences of polypeptides having DXS activity. Hpdxsp represents the amino acid sequence set forth in SEQ ID NO:36; Ecdxsp represents the amino acid sequence set forth in SEQ ID NO:31; and Hidxsp represents the amino acid sequence set forth in SEQ ID NO:33.

[0051] FIG. 16 is a sequence pile-up of four amino acid sequences of polypeptides having DDS, ODS (octaprenyl diphosphate synthase), or SDS (solanesyl diphosphate synthase) activity. Rcsdsp represents an amino acid sequence from Rhodobacter capsulatus having SDS activity (SEQ ID NO:49); Rpodsp represents an amino acid sequence from Rickettsia prowazekii having ODS activity (SEQ ID NO:50); Gsddsp represents the amino acid sequence set forth in SEQ ID NO:46; and Ecodsp represents an amino acid sequence from Escherichia coli ispB having ODS activity (SEQ ID NO:51).

[0052] FIG. 17 is a sequence pile-up of five amino acid sequences of polypeptides having DDS, ODS, or SDS activity. Rpodsp represents the amino acid sequence set forth in SEQ ID NO:50; Gsddsp represents the amino acid sequence set forth in SEQ ID NO:46; Ecodsp represents the amino acid sequence set forth in SEQ ID NO:51; Hiodsp represents an amino acid sequence from Haemophilus influenzae having ODS activity (SEQ ID NO:52); and Rcsdsp represents the amino acid sequence set forth in SEQ ID NO:49.

[0053] FIG. 18 is a diagram of a construct designated appUC18-SHDXS.

[0054] FIG. 19 is a diagram of a construct designated appUC18-RSdds.

[0055] FIG. 20 is a diagram of a construct designated appUC18-SHDDS.

[0056] FIG. 21 is a mass chromatogram obtained from a MG1655 PUC18 specimen.

[0057] FIG. 22 is a mass chromatogram obtained from a MG1655 PUC18-DDS specimen.

[0058] FIG. 23 is a mass spectrum obtained from a MG1655 PUC18 specimen.

[0059] FIG. 24 is a mass spectrum obtained from a MG1655 PUC18-DDS specimen.

[0060] FIG. 25 is a mass spectrum obtained from a MG1655 PUC18-DDS specimen.

[0061] FIG. 26 is a graph plotting length and percent identity with points A, B, C, and D defining an area indicated by shading.

[0062] FIG. 27 is a sequence pile-up of seven amino acid sequences of polypeptides having DXR activity. Bsdxrp represents an amino acid sequence from Bacillus subtilis (SEQ ID NO:98); Hmdxrp represents an amino acid sequence from Haemophilus influenzae (SEQ ID NO:99); Ecdxrp represents an amino acid sequence from Escherichia coli (SEQ ID NO:100); Zmdxrp represents an amino acid sequence from Zymomonas mobilis (SEQ ID NO:101); Sldxrp represents an amino acid sequence from Synechococcus leopoliensis (SEQ ID NO:102); Ssdxrp represents an amino acid sequence from Synechocystis sp. PCC6803 (SEQ ID NO:103); and Mtdxrp represents an amino acid sequence from Mycobacterium tuberculosis (SEQ ID NO:104).

[0063] FIG. 28 is a listing of a nucleic acid sequence that encodes a Sphingomonas trueperi polypeptide having DXR activity (SEQ ID NO:95). The start codon is the GTG at either nucleotide number 575 or 578, and the stop codon is the TGA at nucleotide number 1733. This sequence contains an open reading frame as well as 5′ and 3′ untranslated sequences.

[0064] FIG. 29 is a listing of a nucleic acid sequence that encodes a Sphingomonas trueperi polypeptide having DXR activity (SEQ ID NO:96). This sequence corresponds to the open reading frame.

[0065] FIG. 30 is a listing of an amino acid sequence of a Sphingomonas trueperi polypeptide having DXR activity (SEQ ID NO:97).

[0066] FIG. 31 is a sequence pile-up of twelve nucleic acid sequences that encode polypeptides having DXR activity. Stdxrcds represents the nucleic acid sequence set forth in SEQ ID NO:96; Padxrd represents a nucleic acid sequence from Pseudomonas aeruginosa (SEQ ID NO:105); Zmdxrd represents a nucleic acid sequence from Zymomonas mobilis (SEQ ID NO:106); Sgdxrd represents a nucleic acid sequence from Streptomyces griseolosporeus (SEQ ID NO:107); Nmdxrd represents a nucleic acid sequence from Neisseria meningitidis (SEQ ID NO:108); Ecdxrd represents a nucleic acid sequence from Escherichia coli (SEQ ID NO:109); Sldxrd represents a nucleic acid sequence from Synechococcus leopoliensis (SEQ ID NO:110); Mldxrd represents a nucleic acid sequence from Mycobacterium leprae (SEQ ID NO:111); Pmdxrd represents a nucleic acid sequence from Pasteurella multocida (SEQ ID NO:112); Atdxrd represents a nucleic acid sequence from Arabidopsis thaliana (SEQ ID NO:113); Cjdxrd represents a nucleic acid sequence from Campylobacter jejuni (SEQ ID NO:114); and Pfdxrd represents a nucleic acid sequence from Plasmodium falciparum (SEQ ID NO:115).

[0067] FIG. 32 is a sequence pile-up of sixteen amino acid sequences of polypeptides having DXR activity. Stdxlp represents the amino acid sequence set forth in SEQ ID NO:97; Zmdxrp represents an amino acid sequence from Zymomonas mobilis (SEQ ID NO: 116); Padxrp represents an amino acid sequence from Pseudomonas aeruginosa (SEQ ID NO:117); Ecdxrp represents an amino acid sequence from Escherichia coli (SEQ ID NO:118); Nmdxrp represents an amino acid sequence from Neisseria meningitidis (SEQ ID NO: 119); Hidxrp represents an amino acid sequence from Haemophilus influenzae (SEQ ID NO: 120); Ssdxrp represents an amino acid sequence from Synechocystis sp. PCC6803 (SEQ ID NO:121); Pmdxrp represents an amino acid sequence from Pasteurella multocida (SEQ ID NO:122); Sldxrp represents an amino acid sequence from Synechococcus leopoliensis (SEQ ID NO: 123); Sgdxrp represents an amino acid sequence from Streptomyces griseolosporeus (SEQ ID NO: 124); Bsdxrp represents an amino acid sequence from Bacillus subtilis (SEQ ID NO: 125); Mldxrp represents an amino acid sequence from Mycobacterium leprae (SEQ ID NO: 126); Mtdxrp represents an amino acid sequence from Mycobacterium tuberculosis (SEQ ID NO: 127); Atdxrp represents an amino acid sequence from Arabidopsis thaliana (SEQ ID NO:128); Cjdxrp represents an amino acid sequence from Campylobacter jejuni (SEQ ID NO: 130); and Pfdxrp represents an amino acid sequence from Plasmodium falciparum (SEQ ID NO:131).

DETAILED DESCRIPTION

[0068] The invention provides methods and materials related to the production of isoprenoids. Specifically, the invention provides isolated nucleic acids, substantially pure polypeptides, host cells, and methods and materials for producing various isoprenoid compounds. For the purpose of this invention, an isoprenoid compound is any compound containing a five-carbon isoprenoid unit. Examples of isoprenoid compounds include, without limitation, carotenoids, isoprenes, sterols, terpenes, and ubiquinones. Such isoprenoid compounds can be used in a wide range of applications. For example, isoprenoid compounds produced as described herein can be used in industrial, pharmaceutical, or cosmetic products.

[0069] In general terms, carotenoids are lipophilic pigments typically found in photosynthetic plants and bacteria. Examples of carotenoids include, without limitation, carotenes, xanthophylls, hydrocarbon carotenoids, hydroxy carotenoid derivatives, epoxy carotenoid derivatives, furanoxy carotenoid derivatives, and oxy carotenoid derivatives. Isoprenes are oily hydrocarbons that can be obtained by distilling caoutchouc or gutta-percha. Examples of isoprenes include, without limitation, rubber, vitamin A, and vitamin K. Sterols are steroid-based alcohols typically having a hydrocarbon side-chain of eight to ten carbon atoms at the 17-beta position and a hydroxyl group at the 3-beta position. Examples of sterols include, without limitation, ergosterol, cholesterol, and stigmasterol. Terpenes are lipid species typically found in plants in great abundance. Examples of terpenes include, without limitation, dolichol, squalene, and limonene. Ubiquinones are 2,3-dimethoxy-5-methylbenzoquinone derivatives having a side chain containing at least one isoprenoid unit. Typically, ubiquinone is referred to as Coenzyme Q (CoQ). In addition, the number of isoprenoid units of a side chain of a particular ubiquinone is used to identify that particular ubiquinone. For example, a ubiquinone with six isoprenoid units is referred to as CoQ(6), while a ubiquinone with ten isoprenoid units is referred to as CoQ(10). It is noted that CoQ(10) also is referred to as ubidecarenone. Examples of ubiquinones include, without limitation, CoQ(6), CoQ(8), CoQ(10), and CoQ(12).

[0070] Isoprenoid compounds can be pyruvate-derived products. The term “pyruvate-derived product” as used herein refers to any compound that is synthesized from pyruvate within no more than 25 enzymatic steps. Thus, an isoprenoid compound is not a pyruvate-derived product if that isoprenoid compound is synthesized from pyruvate in more than 25 enzymatic steps. An enzymatic step is a single chemical reaction catalyzed by a polypeptide having enzymatic activity. The term “polypeptide having enzymatic activity” as used herein refers to any polypeptide that catalyzes a chemical reaction of other substances without itself being destroyed or altered upon completion of the reaction. Typically, a polypeptide having enzymatic activity catalyzes the formation of one or more products from one or more substrates. Such polypeptides can have any type of enzymatic activity including, without limitation, the enzymatic activity associated with an enzyme such as DXS, DDS, ODS, SDS, DXR (1-deoxy-D-xylulose 5-phosphate reductoisomerase), ispD (4-diphosphocytidyl-2C-methyl-D-erythritol synthase), and ispE (4-diphosphocytidyl-2C-methyl-D-erythritol kinase).

[0071] A polypeptide having a particular enzymatic activity can be a polypeptide that is either naturally-occurring or non-naturally-occurring. A naturally-occurring polypeptide is any polypeptide having an amino acid sequence as found in nature, including wild-type and polymorphic polypeptides. Such naturally-occurring polypeptides can be obtained from any species including, without limitation, animal (e.g., mammalian), plant, fungal, and bacterial species. A non-naturally-occurring polypeptide is any polypeptide having an amino acid sequence that is not found in nature. Thus, a non-naturally-occurring polypeptide can be a mutated version of a naturally-occurring polypeptide, or an engineered polypeptide. For example, a non-naturally-occurring polypeptide having DDS activity can be a mutated version of a naturally-occurring polypeptide having DDS activity that retains at least some DDS activity. A polypeptide can be mutated by, for example, sequence additions, deletions, substitutions, or combinations thereof.

[0072] Examples of isoprenoid compounds that are pyruvate-derived products include, without limitation, CoQ(6), CoQ(7), CoQ(8), CoQ(9), CoQ(10), astaxanthin, canthaxanthin, lutein, zeaxanthin, beta-carotene, lycopene, capsanthin, bixin, norbixin, crocetin, zeta-carotene, vitamin E, gibberellins, abscisic acid, ergosterol, geraniol, and latex.

[0073] As depicted in FIG. 1, multiple polypeptides can be used to convert glucose to CoQ(10). For example, polypeptides having DXS, DXR, LytB, and DDS activity can be used to convert glucose to CoQ(10). Such polypeptides can be obtained and used to make CoQ(10) as described herein.

[0074] 1. Nucleic Acids

[0075] The term “nucleic acid” as used herein encompasses both RNA and DNA, including cDNA, genomic DNA, and synthetic (e.g., chemically synthesized) DNA. The nucleic acid can be double-stranded or single-stranded. Where single-stranded, the nucleic acid can be the sense strand or the antisense strand. In addition, nucleic acid can be circular or linear.

[0076] The term “isolated” as used herein with reference to nucleic acid refers to a naturally-occurring nucleic acid that is not immediately contiguous with both of the sequences with which it is immediately contiguous (one on the 5′ end and one on the 3′ end) in the naturally-occurring genome of the organism from which it is derived. For example, an isolated nucleic acid can be, without limitation, a recombinant DNA molecule of any length, provided one of the nucleic acid sequences normally found immediately flanking that recombinant DNA molecule in a naturally-occurring genome is removed or absent. Thus, an isolated nucleic acid includes, without limitation, a recombinant DNA that exists as a separate molecule (e.g., a cDNA or a genomic DNA fragment produced by PCR or restriction endonuclease treatment) independent of other sequences as well as recombinant DNA that is incorporated into a vector, an autonomously replicating plasmid, a virus (e.g., a retrovirus, adenovirus, or herpes virus), or into the genomic DNA of a prokaryote or eukaryote. In addition, an isolated nucleic acid can include a recombinant DNA molecule that is part of a hybrid or fusion nucleic acid sequence.

[0077] The term “isolated” as used herein with reference to nucleic acid also includes any non-naturally-occurring nucleic acid since non-naturally-occurring nucleic acid sequences are not found in nature and do not have immediately contiguous sequences in a naturally-occurring genome. For example, non-naturally-occurring nucleic acid such as an engineered nucleic acid is considered to be isolated nucleic acid. Engineered nucleic acid can be made using common molecular cloning or chemical nucleic acid synthesis techniques. Isolated non-naturally-occurring nucleic acid can be independent of other sequences, or incorporated into a vector, an autonomously replicating plasmid, a virus (e.g., a retrovirus, adenovirus, or herpes virus), or the genomic DNA of a prokaryote or eukaryote. In addition, a non-naturally-occurring nucleic acid can include a nucleic acid molecule that is part of a hybrid or fusion nucleic acid sequence.

[0078] It will be apparent to those of skill in the art that a nucleic acid existing among hundreds to millions of other nucleic acid molecules within, for example, cDNA or genomic libraries, or gel slices containing a genomic DNA restriction digest is not to be considered an isolated nucleic acid.

[0079] The term “exogenous” as used herein with reference to nucleic acid and a particular cell refers to any nucleic acid that does not originate from that particular cell as found in nature. Thus, all non-naturally-occurring nucleic acid is considered to be exogenous to a cell once introduced into the cell. It is important to note that non-naturally-occurring nucleic acid can contain nucleic acid sequences or fragments of nucleic acid sequences that are found in nature provided the nucleic acid as a whole does not exist in nature. For example, a nucleic acid molecule containing a genomic DNA sequence within an expression vector is non-naturally-occurring nucleic acid, and thus is exogenous to a cell once introduced into the cell, since that nucleic acid molecule as a whole (genomic DNA plus vector DNA) does not exist in nature. Thus, any vector, autonomously replicating plasmid, or virus (e.g., retrovirus, adenovirus, or herpes virus) that as a whole does not exist in nature is considered to be non-naturally-occurring nucleic acid. It follows that genomic DNA fragments produced by PCR or restriction endonuclease treatment as well as cDNAs are considered to be non-naturally-occurring nucleic acid since they exist as separate molecules not found in nature. It also follows that any nucleic acid containing a promoter sequence and polypeptide-encoding sequence (e.g., cDNA or genomic DNA) in an arrangement not found in nature is non-naturally-occurring nucleic acid.

[0080] Nucleic acid that is naturally-occurring can be exogenous to a particular cell. For example, an entire chromosome isolated from a cell of person X is an exogenous nucleic acid with respect to a cell of person Y once that chromosome is introduced into Y's cell.

[0081] The invention provides isolated nucleic acid that contains a nucleic acid sequence having (1) a length, and (2) a percent identity to an identified nucleic acid sequence over that length. The invention also provides isolated nucleic acid that contains a nucleic acid sequence encoding a polypeptide that contains an amino acid sequence having (1) a length, and (2) a percent identity to an identified amino acid sequence over that length. Typically, the identified nucleic acid or amino acid sequence is a sequence referenced by a particular sequence identification number, and the nucleic acid or amino acid sequence being compared to the identified sequence is referred to as the target sequence. For example, an identified sequence can be the sequence set forth in SEQ ID NO: 1.

[0082] A length and percent identity over that length for any nucleic acid or amino acid sequence are determined as follows. First, a nucleic acid or amino acid sequence is compared to the identified nucleic acid or amino acid sequence using the BLAST 2 Sequences (Bl2seq) program from the stand-alone version of BLASTZ containing BLASTN version 2.0.14 and BLASTP version 2.0.14. This stand-alone version of BLASTZ can be obtained from the University of Wisconsin library as well as at www.fr.com or www.ncbi.nlm.nih.gov. Instructions explaining how to use the Bl2seq program can be found in the readme file accompanying BLASTZ. Bl2seq performs a comparison between two sequences using either the BLASTN or BLASTP algorithm. BLASTN is used to compare nucleic acid sequences, while BLASTP is used to compare amino acid sequences. To compare two nucleic acid sequences, the options are set as follows: -i is set to a file containing the first nucleic acid sequence to be compared (e.g., C:\seq1.txt); -j is set to a file containing the second nucleic acid sequence to be compared (e.g., C:\seq2.txt); -p is set to blastn; -o is set to any desired file name (e.g., C:\output.txt); -q is set to -1; -r is set to 2; and all other options are left at their default setting. For example, the following command can be used to generate an output file containing a comparison between two nucleic acid sequences: C:\Bl2seq -i c:\seq1.txt -j c:\seq2.txt -p blastn -o c:\output.txt -q -1 -r 2. To compare two amino acid sequences, the options of Bl2seq are set as follows: -i is set to a file containing the first amino acid sequence to be compared (e.g., C:\seq1.txt); -j is set to a file containing the second amino acid sequence to be compared (e.g., C:\seq2.txt); -p is set to blastp; -o is set to any desired file name (e.g., C:\output.txt); and all other options are left at their default setting. For example, the following command can be used to generate an output file containing a comparison between two amino acid sequences: C:\Bl2seq -i c:\seq1.txt -j c:\seq2.txt -p blastp -o c:\output.txt. If the target sequence shares homology with any portion of the identified sequence, then the designated output file will present those regions of homology as aligned sequences. If the target sequence does not share homology with any portion of the identified sequence, then the designated output file will not present aligned sequences. Once aligned, a length is determined by counting the number of consecutive nucleotides or amino acid residues from the target sequence presented in alignment with sequence from the identified sequence starting with any matched position and ending with any other matched position. A matched position is any position where an identical nucleotide or amino acid residue is presented in both the target and identified sequence. Gaps presented in the target sequence are not counted since gaps are not nucleotides or amino acid residues. Likewise, gaps presented in the identified sequence are not counted since target sequence nucleotides or amino acid residues are counted, not nucleotides or amino acid residues from the identified sequence.
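The option settings described above can also be assembled programmatically. The following Python sketch is offered only as an illustration: the helper name bl2seq_command and the default executable name are assumptions, and only the options named in the text (-i, -j, -p, -o, -q, and -r) are set.

```python
def bl2seq_command(seq1, seq2, output, nucleic, exe="Bl2seq"):
    """Return the Bl2seq argument list for comparing two sequence files.

    Hypothetical helper: only the options named in the text are set, and
    all other options are left at their default setting.
    """
    program = "blastn" if nucleic else "blastp"
    args = [exe, "-i", seq1, "-j", seq2, "-p", program, "-o", output]
    if nucleic:
        # Nucleic acid comparisons additionally set -q to -1 and -r to 2.
        args += ["-q", "-1", "-r", "2"]
    return args


# Nucleic acid comparison, using the example file names from the text.
print(" ".join(bl2seq_command(r"c:\seq1.txt", r"c:\seq2.txt",
                              r"c:\output.txt", nucleic=True)))
```

The returned list can be passed to any process-launching routine; it is returned as a list rather than a single string so that file names containing spaces need no additional quoting.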

[0083] The percent identity over a determined length is determined by counting the number of matched positions over that length and dividing that number by the length followed by multiplying the resulting value by 100. For example, if (1) a 1000 nucleotide target sequence is compared to the sequence set forth in SEQ ID NO: 1, (2) the Bl2seq program presents 200 nucleotides from the target sequence aligned with a region of the sequence set forth in SEQ ID NO: 1 where the first and last nucleotides of that 200 nucleotide region are matches, and (3) the number of matches over those 200 aligned nucleotides is 180, then the 1000 nucleotide target sequence contains a length of 200 and a percent identity over that length of 90 (i.e., 180 ÷ 200 × 100 = 90).
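The calculation just described can be expressed as a short Python sketch. The function name percent_identity and the input checks are illustrative assumptions rather than part of the Bl2seq program.

```python
def percent_identity(matched_positions, length):
    """Percent identity: matched positions divided by length, times 100."""
    if length <= 0:
        raise ValueError("length must be a positive integer")
    if not 0 <= matched_positions <= length:
        raise ValueError("matched positions must be between 0 and length")
    # Multiplying before dividing keeps whole-number results exact.
    return 100 * matched_positions / length


# The worked example from the text: 180 matches over a length of 200.
print(percent_identity(180, 200))  # 90.0
```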

[0084] It will be appreciated that a single nucleic acid or amino acid target sequence that aligns with an identified sequence can have many different lengths with each length having its own percent identity. For example, a target sequence containing a 20 nucleotide region that aligns with an identified sequence as follows has many different lengths including those listed in Table 1.

                     1                 20
Target Sequence:     AGGTCGTGTACTGTCAGTCA
                     | || ||| |||| |||| |
Identified Sequence: ACGTGGTGAACTGCCAGTGA

[0085] TABLE 1

Starting    Ending                Matched      Percent
Position    Position    Length    Positions    Identity
1           20          20        15           75.0
1           18          18        14           77.8
1           15          15        11           73.3
6           20          15        12           80.0
6           17          12        10           83.3
6           15          10        8            80.0
8           20          13        10           76.9
8           16          9         7            77.8
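The entries of the table can be recomputed from the two aligned 20 nucleotide sequences. The following Python sketch assumes an ungapped alignment, as in this example, and uses 1-based inclusive positions; the names span_stats and ROWS are illustrative only.

```python
TARGET = "AGGTCGTGTACTGTCAGTCA"
IDENTIFIED = "ACGTGGTGAACTGCCAGTGA"


def span_stats(target, identified, start, end):
    """Return (length, matched positions, percent identity) for the
    1-based, inclusive span start..end of an ungapped alignment."""
    if len(target) != len(identified):
        raise ValueError("aligned sequences must have equal length")
    if not 1 <= start < end <= len(target):
        raise ValueError("span must lie within the alignment")
    pairs = list(zip(target, identified))[start - 1:end]
    # A length starts on one matched position and ends on another.
    if pairs[0][0] != pairs[0][1] or pairs[-1][0] != pairs[-1][1]:
        raise ValueError("span must start and end on matched positions")
    length = len(pairs)
    matches = sum(1 for t, i in pairs if t == i)
    return length, matches, 100 * matches / length


# Starting and ending positions of the rows listed in the table.
ROWS = [(1, 20), (1, 18), (1, 15), (6, 20), (6, 17), (6, 15), (8, 20), (8, 16)]
for start, end in ROWS:
    length, matches, pct = span_stats(TARGET, IDENTIFIED, start, end)
    print(f"{start:>2} {end:>2} {length:>2} {matches:>2} {pct:.1f}")
```

Alignments containing gaps would need the additional counting rules described above for gaps, which this sketch does not attempt to implement.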

[0086] It is noted that the percent identity value is rounded to the nearest tenth. For example, 78.11, 78.12, 78.13, and 78.14 are rounded down to 78.1, while 78.15, 78.16, 78.17, 78.18, and 78.19 are rounded up to 78.2. It is also noted that the length value will always be an integer.
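The rounding rule can be sketched in Python as follows. The use of the standard decimal module with half-up rounding is an assumption chosen to reproduce the examples given (78.15 rounding up to 78.2); the function name round_percent_identity is illustrative.

```python
from decimal import Decimal, ROUND_HALF_UP


def round_percent_identity(value):
    """Round a percent identity to the nearest tenth, halves rounded up."""
    # str() yields the shortest decimal form of a float, so 78.15 is
    # treated as exactly 78.15 rather than as its binary approximation.
    return Decimal(str(value)).quantize(Decimal("0.1"), rounding=ROUND_HALF_UP)


print(round_percent_identity(78.14))  # 78.1
print(round_percent_identity(78.15))  # 78.2
```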

[0087] The invention provides an isolated nucleic acid containing a nucleic acid sequence that has at least one length and percent identity over that length as determined above such that the point defined by that length and percent identity is within the area defined by points A, B, C, and D of FIG. 26. In addition, the invention provides an isolated nucleic acid containing a nucleic acid sequence that encodes a polypeptide containing an amino acid sequence that has at least one length and percent identity over that length as determined above such that the point defined by that length and percent identity is within the area defined by points A, B, C, and D of FIG. 26. The point defined by a length and percent identity over that length is that point on the X/Y coordinate of FIG. 26 where the X axis is the length and the Y axis is the percent identity. Thus, the point defined by a nucleic acid sequence with a length of 200 and a percent identity of 90 has coordinates (200, 90). For the purpose of this invention, any point that falls on point A, B, C, or D is considered within the area defined by points A, B, C, and D of FIG. 26. Likewise, any point that falls on a line that defines the area defined by points A, B, C, and D is considered within the area defined by points A, B, C, and D of FIG. 26.

[0088] It will be appreciated that the term “the area defined by points A, B, C, and D of FIG. 26” as used herein refers to that area defined by the lines that connect point A with point B, point B with point C, point C with point D, and point D with point A. Points A, B, C, and D can define an area having any shape defined by four points (e.g., square, rectangle, or rhombus). In addition, two or more points can have the same coordinates. For example, points B and C can have identical coordinates. In this case, the area defined by points A, B, C, and D of FIG. 26 is triangular. If three points have identical coordinates, then the area defined by points A, B, C, and D of FIG. 26 is a line. In this case, any point that falls on that line would be considered within the area defined by points A, B, C, and D of FIG. 26. If all four points have identical coordinates, then the area defined by points A, B, C, and D of FIG. 26 is a point. In all cases, simple algebraic equations can be used to determine whether a point is within the area defined by points A, B, C, and D of FIG. 26.
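One possible way to carry out such a determination is sketched below in Python. The sketch is an illustration only: it assumes that the points are connected in the order A, B, C, D, A without the connecting lines crossing one another, it uses exact comparisons (so integer or otherwise exact coordinates are assumed), and it uses an even-odd ray-casting test that is chosen here for convenience and is not prescribed by the text. The names on_segment and within_area are hypothetical.

```python
def on_segment(p, a, b):
    """True if point p lies on the closed line segment from a to b."""
    (px, py), (ax, ay), (bx, by) = p, a, b
    cross = (bx - ax) * (py - ay) - (by - ay) * (px - ax)
    if cross != 0:
        return False  # p is not on the line through a and b
    return (min(ax, bx) <= px <= max(ax, bx)
            and min(ay, by) <= py <= max(ay, by))


def within_area(p, a, b, c, d):
    """True if p is inside, or on the boundary of, the area A-B-C-D."""
    vertices = [a, b, c, d]
    edges = list(zip(vertices, vertices[1:] + vertices[:1]))
    # A point on A, B, C, or D, or on a connecting line, is within the area.
    # This check also covers areas that collapse to a line or a point.
    if any(on_segment(p, u, v) for u, v in edges):
        return True
    # Even-odd ray casting for points strictly inside the area.
    px, py = p
    inside = False
    for (ux, uy), (vx, vy) in edges:
        if (uy > py) != (vy > py):
            x_cross = ux + (py - uy) * (vx - ux) / (vy - uy)
            if px < x_cross:
                inside = not inside
    return inside


# Rectangle, then a triangle in which points B and C coincide.
print(within_area((2000, 97), (3626, 100), (3626, 95), (1900, 95), (1900, 100)))
print(within_area((1, 1), (0, 0), (4, 0), (4, 0), (0, 4)))
```

Both calls print True. When three or four of the points coincide, only the boundary check can succeed, which matches the statements above that the area is then a line or a point.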

[0089] It is noted that FIG. 26 is a graphical representation presenting possible positions of points A, B, C, and D. The shaded area illustrated in FIG. 26 represents one possible example, while the arrows indicate that other positions for points A, B, C, and D are possible. In fact, points A, B, C, and D can have any X coordinate and any Y coordinate. For example, point A can have an X coordinate equal to the number of nucleotides or amino acid residues in an identified sequence, and a Y coordinate of 100. Point B can have an X coordinate equal to the number of nucleotides or amino acid residues in an identified sequence, and a Y coordinate less than or equal to 100 (e.g., 50, 55, 65, 70, 75, 80, 85, 90, 95, and 99). Point C can have an X coordinate equal to a percent (e.g., 1, 2, 5, 10, 15, or more percent) of the number of nucleotides or amino acid residues in an identified sequence, and a Y coordinate less than or equal to 100 (e.g., 50, 55, 65, 70, 75, 80, 85, 90, 95, and 99). Point D can have an X coordinate equal to the length of a typical PCR primer (e.g., 12, 13, 14, 15, 16, 17, or more) or antigenic polypeptide (e.g., 5, 6, 7, 8, 9, 10, 11, 12, or more), and a Y coordinate less than or equal to 100 (e.g., 50, 55, 65, 70, 75, 80, 85, 90, 95, and 99).

[0090] An isolated nucleic acid containing a nucleic acid sequence having a length and a percent identity to the sequence set forth in SEQ ID NO: 1 over that length is within the scope of the invention provided the point defined by that length and percent identity is within the area defined by points A, B, C, and D of FIG. 26; where point A has an X coordinate less than or equal to 3626, and a Y coordinate less than or equal to 100; where point B has an X coordinate less than or equal to 3626, and a Y coordinate greater than or equal to 65; where point C has an X coordinate greater than or equal to 50, and a Y coordinate greater than or equal to 65; and where point D has an X coordinate greater than or equal to 12, and a Y coordinate less than or equal to 100. For example, the X coordinate for point A can be 3626, 3600, 3500, 3000, 2500, or less; and the Y coordinate for point A can be 100, 99, 95, 90, 85, 80, 75, or less. The X coordinate for point B can be 3626, 3600, 3500, 3000, 2500, or less; and the Y coordinate for point B can be 65, 70, 75, 80, 85, 90, 95, 99 or more. The X coordinate for point C can be 50, 60, 70, 80, 90, 100, 150, 200, or more; and the Y coordinate for point C can be 65, 70, 75, 80, 85, 90, 95, 99 or more. The X coordinate for point D can be 12, 13, 14, 15, 16, 17, 18, 19, 20, 25, 30, 40, 50, 75, 100, or more; and the Y coordinate for point D can be 100, 99, 95, 90, 85, 80, 75, or less. In one embodiment, point A can be (3626, 100), point B can be (3626, 95), point C can be (1900, 95), and point D can be (1900, 100).

[0091] An isolated nucleic acid containing a nucleic acid sequence having a length and a percent identity to the sequence set forth in SEQ ID NO:2 over that length is within the scope of the invention provided the point defined by that length and percent identity is within the area defined by points A, B, C, and D of FIG. 26; where point A has an X coordinate less than or equal to 1926, and a Y coordinate less than or equal to 100; where point B has an X coordinate less than or equal to 1926, and a Y coordinate greater than or equal to 65; where point C has an X coordinate greater than or equal to 50, and a Y coordinate greater than or equal to 65; and where point D has an X coordinate greater than or equal to 12, and a Y coordinate less than or equal to 100. For example, the X coordinate for point A can be 1926, 1900, 1850, 1800, 1750, or less; and the Y coordinate for point A can be 100, 99, 95, 90, 85, 80, 75, or less. The X coordinate for point B can be 1926, 1900, 1850, 1800, 1750, or less; and the Y coordinate for point B can be 65, 70, 75, 80, 85, 90, 95, 99 or more. The X coordinate for point C can be 50, 60, 70, 80, 90, 100, 150, 200, or more; and the Y coordinate for point C can be 65, 70, 75, 80, 85, 90, 95, 99 or more. The X coordinate for point D can be 12, 13, 14, 15, 16, 17, 18, 19, 20, 25, 30, 40, 50, 75, 100, or more; and the Y coordinate for point D can be 100, 99, 95, 90, 85, 80, 75, or less. In one embodiment, point A can be (1926, 100), point B can be (1926, 95), point C can be (1000, 95), and point D can be (1000, 100).

[0092] An isolated nucleic acid containing a nucleic acid sequence that encodes a polypeptide containing an amino acid sequence having a length and a percent identity to the sequence set forth in SEQ ID NO:3 over that length is within the scope of the invention provided the point defined by that length and percent identity is within the area defined by points A, B, C, and D of FIG. 26; where point A has an X coordinate less than or equal to 641, and a Y coordinate less than or equal to 100; where point B has an X coordinate less than or equal to 641, and a Y coordinate greater than or equal to 50; where point C has an X coordinate greater than or equal to 25, and a Y coordinate greater than or equal to 50; and where point D has an X coordinate greater than or equal to 5, and a Y coordinate less than or equal to 100. For example, the X coordinate for point A can be 641, 635, 630, 625, 620, or less; and the Y coordinate for point A can be 100, 99, 95, 90, 85, 80, 75, or less. The X coordinate for point B can be 641, 635, 630, 625, 620, or less; and the Y coordinate for point B can be 50, 55, 60, 65, 70, 75, 80, 85, 90, 95, 99 or more. The X coordinate for point C can be 25, 30, 35, 40, 50, 60, 70, 80, 90, 100, 150, 200, or more; and the Y coordinate for point C can be 50, 55, 60, 65, 70, 75, 80, 85, 90, 95, 99 or more. The X coordinate for point D can be 5, 6, 7, 8, 9, 10, 15, 20, 25, 30, 40, 50, 75, 100, or more; and the Y coordinate for point D can be 100, 99, 95, 90, 85, 80, 75, or less. In one embodiment, point A can be (641, 100), point B can be (641, 95), point C can be (400, 95), and point D can be (400, 100).

[0093] An isolated nucleic acid containing a nucleic acid sequence having a length and a percent identity to the sequence set forth in SEQ ID NO:37 over that length is within the scope of the invention provided the point defined by that length and percent identity is within the area defined by points A, B, C, and D of FIG. 26; where point A has an X coordinate less than or equal to 1990, and a Y coordinate less than or equal to 100; where point B has an X coordinate less than or equal to 1990, and a Y coordinate greater than or equal to 65; where point C has an X coordinate greater than or equal to 50, and a Y coordinate greater than or equal to 65; and where point D has an X coordinate greater than or equal to 12, and a Y coordinate less than or equal to 100. For example, the X coordinate for point A can be 1990, 1950, 1900, 1850, 1800, 1750, or less; and the Y coordinate for point A can be 100, 99, 95, 90, 85, 80, 75, or less. The X coordinate for point B can be 1990, 1950, 1900, 1850, 1800, 1750, or less; and the Y coordinate for point B can be 65, 70, 75, 80, 85, 90, 95, 99 or more. The X coordinate for point C can be 50, 60, 70, 80, 90, 100, 150, 200, or more; and the Y coordinate for point C can be 65, 70, 75, 80, 85, 90, 95, 99 or more. The X coordinate for point D can be 12, 13, 14, 15, 16, 17, 18, 19, 20, 25, 30, 40, 50, 75, 100, or more; and the Y coordinate for point D can be 100, 99, 95, 90, 85, 80, 75, or less. In one embodiment, point A can be (1990, 100), point B can be (1990, 95), point C can be (1000, 95), and point D can be (1000, 100).

[0094] An isolated nucleic acid containing a nucleic acid sequence having a length and a percent identity to the sequence set forth in SEQ ID NO:38 over that length is within the scope of the invention provided the point defined by that length and percent identity is within the area defined by points A, B, C, and D of FIG. 26; where point A has an X coordinate less than or equal to 1002, and a Y coordinate less than or equal to 100; where point B has an X coordinate less than or equal to 1002, and a Y coordinate greater than or equal to 65; where point C has an X coordinate greater than or equal to 50, and a Y coordinate greater than or equal to 65; and where point D has an X coordinate greater than or equal to 12, and a Y coordinate less than or equal to 100. For example, the X coordinate for point A can be 1002, 950, 900, 850, 800, 750, or less; and the Y coordinate for point A can be 100, 99, 95, 90, 85, 80, 75, or less. The X coordinate for point B can be 1002, 950, 900, 850, 800, 750, or less; and the Y coordinate for point B can be 65, 70, 75, 80, 85, 90, 95, 99 or more. The X coordinate for point C can be 50, 60, 70, 80, 90, 100, 150, 200, or more; and the Y coordinate for point C can be 65, 70, 75, 80, 85, 90, 95, 99 or more. The X coordinate for point D can be 12, 13, 14, 15, 16, 17, 18, 19, 20, 25, 30, 40, 50, 75, 100, or more; and the Y coordinate for point D can be 100, 99, 95, 90, 85, 80, 75, or less. In one embodiment, point A can be (1002, 100), point B can be (1002, 95), point C can be (500, 95), and point D can be (500, 100).

[0095] An isolated nucleic acid containing a nucleic acid sequence that encodes a polypeptide containing an amino acid sequence having a length and a percent identity to the sequence set forth in SEQ ID NO:39 over that length is within the scope of the invention provided the point defined by that length and percent identity is within the area defined by points A, B, C, and D of FIG. 26; where point A has an X coordinate less than or equal to 333, and a Y coordinate less than or equal to 100; where point B has an X coordinate less than or equal to 333, and a Y coordinate greater than or equal to 50; where point C has an X coordinate greater than or equal to 25, and a Y coordinate greater than or equal to 50; and where point D has an X coordinate greater than or equal to 5, and a Y coordinate less than or equal to 100. For example, the X coordinate for point A can be 333, 330, 325, 320, 315, or less; and the Y coordinate for point A can be 100, 99, 95, 90, 85, 80, 75, or less. The X coordinate for point B can be 333, 330, 325, 320, 315, or less; and the Y coordinate for point B can be 50, 55, 60, 65, 70, 75, 80, 85, 90, 95, 99 or more. The X coordinate for point C can be 25, 30, 35, 40, 50, 60, 70, 80, 90, 100, 150, 200, or more; and the Y coordinate for point C can be 50, 55, 60, 65, 70, 75, 80, 85, 90, 95, 99 or more. The X coordinate for point D can be 5, 6, 7, 8, 9, 10, 15, 20, 25, 30, 40, 50, 75, 100, or more; and the Y coordinate for point D can be 100, 99, 95, 90, 85, 80, 75, or less. In one embodiment, point A can be (333, 100), point B can be (333, 95), point C can be (150, 95), and point D can be (150, 100).

[0096] An isolated nucleic acid containing a nucleic acid sequence having a length and a percent identity to the sequence set forth in SEQ ID NO:40 over that length is within the scope of the invention provided the point defined by that length and percent identity is within the area defined by points A, B, C, and D of FIG. 26; where point A has an X coordinate less than or equal to 1833, and a Y coordinate less than or equal to 100; where point B has an X coordinate less than or equal to 1833, and a Y coordinate greater than or equal to 65; where point C has an X coordinate greater than or equal to 50, and a Y coordinate greater than or equal to 65; and where point D has an X coordinate greater than or equal to 12, and a Y coordinate less than or equal to 100. For example, the X coordinate for point A can be 1833, 1800, 1750, 1700, 1650, or less; and the Y coordinate for point A can be 100, 99, 95, 90, 85, 80, 75, or less. The X coordinate for point B can be 1833, 1800, 1750, 1700, 1650, or less; and the Y coordinate for point B can be 65, 70, 75, 80, 85, 90, 95, 99 or more. The X coordinate for point C can be 50, 60, 70, 80, 90, 100, 150, 200, or more; and the Y coordinate for point C can be 65, 70, 75, 80, 85, 90, 95, 99 or more. The X coordinate for point D can be 12, 13, 14, 15, 16, 17, 18, 19, 20, 25, 30, 40, 50, 75, 100, or more; and the Y coordinate for point D can be 100, 99, 95, 90, 85, 80, 75, or less. In one embodiment, point A can be (1833, 100), point B can be (1833, 95), point C can be (900, 95), and point D can be (900, 100).

[0097] An isolated nucleic acid containing a nucleic acid sequence having a length and a percent identity to the sequence set forth in SEQ ID NO:41 over that length is within the scope of the invention provided the point defined by that length and percent identity is within the area defined by points A, B, C, and D of FIG. 26; where point A has an X coordinate less than or equal to 1014, and a Y coordinate less than or equal to 100; where point B has an X coordinate less than or equal to 1014, and a Y coordinate greater than or equal to 65; where point C has an X coordinate greater than or equal to 50, and a Y coordinate greater than or equal to 65; and where point D has an X coordinate greater than or equal to 12, and a Y coordinate less than or equal to 100. For example, the X coordinate for point A can be 1014, 950, 900, 800, 700, 600, or less; and the Y coordinate for point A can be 100, 99, 95, 90, 85, 80, 75, or less. The X coordinate for point B can be 1014, 950, 900, 800, 700, 600, or less; and the Y coordinate for point B can be 65, 70, 75, 80, 85, 90, 95, 99 or more. The X coordinate for point C can be 50, 60, 70, 80, 90, 100, 150, 200, or more; and the Y coordinate for point C can be 65, 70, 75, 80, 85, 90, 95, 99 or more. The X coordinate for point D can be 12, 13, 14, 15, 16, 17, 18, 19, 20, 25, 30, 40, 50, 75, 100, or more; and the Y coordinate for point D can be 100, 99, 95, 90, 85, 80, 75, or less. In one embodiment, point A can be (1014, 100), point B can be (1014, 95), point C can be (500, 95), and point D can be (500, 100).

[0098] An isolated nucleic acid containing a nucleic acid sequence that encodes a polypeptide containing an amino acid sequence having a length and a percent identity to the sequence set forth in SEQ ID NO:42 over that length is within the scope of the invention provided the point defined by that length and percent identity is within the area defined by points A, B, C, and D of FIG. 26; where point A has an X coordinate less than or equal to 337, and a Y coordinate less than or equal to 100; where point B has an X coordinate less than or equal to 337, and a Y coordinate greater than or equal to 50; where point C has an X coordinate greater than or equal to 25, and a Y coordinate greater than or equal to 50; and where point D has an X coordinate greater than or equal to 5, and a Y coordinate less than or equal to 100. For example, the X coordinate for point A can be 337, 335, 330, 325, 320, 315, or less; and the Y coordinate for point A can be 100, 99, 95, 90, 85, 80, 75, or less. The X coordinate for point B can be 337, 335, 330, 325, 320, 315, or less; and the Y coordinate for point B can be 50, 55, 60, 65, 70, 75, 80, 85, 90, 95, 99 or more. The X coordinate for point C can be 25, 30, 35, 40, 50, 60, 70, 80, 90, 100, 150, 200, or more; and the Y coordinate for point C can be 50, 55, 60, 65, 70, 75, 80, 85, 90, 95, 99 or more. The X coordinate for point D can be 5, 6, 7, 8, 9, 10, 15, 20, 25, 30, 40, 50, 75, 100, or more; and the Y coordinate for point D can be 100, 99, 95, 90, 85, 80, 75, or less. In one embodiment, point A can be (337, 100), point B can be (337, 95), point C can be (150, 95), and point D can be (150, 100).

[0099] An isolated nucleic acid containing a nucleic acid sequence having a length and a percent identity to the sequence set forth in SEQ ID NO:95 over that length is within the scope of the invention provided the point defined by that length and percent identity is within the area defined by points A, B, C, and D of FIG. 26; where point A has an X coordinate less than or equal to 2017, and a Y coordinate less than or equal to 100; where point B has an X coordinate less than or equal to 2017, and a Y coordinate greater than or equal to 65; where point C has an X coordinate greater than or equal to 50, and a Y coordinate greater than or equal to 65; and where point D has an X coordinate greater than or equal to 12, and a Y coordinate less than or equal to 100. For example, the X coordinate for point A can be 2017, 2000, 1900, 1950, 1800, 1700, 1600, or less; and the Y coordinate for point A can be 100, 99, 95, 90, 85, 80, 75, or less. The X coordinate for point B can be 2017, 2000, 1900, 1950, 1800, 1700, 1600, or less; and the Y coordinate for point B can be 65, 70, 75, 80, 85, 90, 95, 99 or more. The X coordinate for point C can be 50, 60, 70, 80, 90, 100, 150, 200, 500, 1000, 1500, or more; and the Y coordinate for point C can be 65, 70, 75, 80, 85, 90, 95, 99 or more. The X coordinate for point D can be 12, 13, 14, 15, 16, 17, 18, 19, 20, 25, 30, 40, 50, 75, 100, 250, 500, 1000, 1500, or more; and the Y coordinate for point D can be 100, 99, 95, 90, 85, 80, 75, or less. In one embodiment, point A can be (2017, 100), point B can be (2017, 95), point C can be (1800, 95), and point D can be (1800, 100).

[0100] An isolated nucleic acid containing a nucleic acid sequence having a length and a percent identity to the sequence set forth in SEQ ID NO:96 over that length is within the scope of the invention provided the point defined by that length and percent identity is within the area defined by points A, B, C, and D of FIG. 26; where point A has an X coordinate less than or equal to 1161, and a Y coordinate less than or equal to 100; where point B has an X coordinate less than or equal to 1161, and a Y coordinate greater than or equal to 65; where point C has an X coordinate greater than or equal to 50, and a Y coordinate greater than or equal to 65; and where point D has an X coordinate greater than or equal to 12, and a Y coordinate less than or equal to 100. For example, the X coordinate for point A can be 1161, 1050, 1000, 950, 900, 800, 700, 600, or less; and the Y coordinate for point A can be 100, 99, 95, 90, 85, 80, 75, or less. The X coordinate for point B can be 1161, 1050, 1000, 950, 900, 800, 700, 600, or less; and the Y coordinate for point B can be 65, 70, 75, 80, 85, 90, 95, 99 or more. The X coordinate for point C can be 50, 60, 70, 80, 90, 100, 150, 200, 250, 500, 1000, or more; and the Y coordinate for point C can be 65, 70, 75, 80, 85, 90, 95, 99 or more. The X coordinate for point D can be 12, 13, 14, 15, 16, 17, 18, 19, 20, 25, 30, 40, 50, 75, 100, 250, 500, 1000, or more; and the Y coordinate for point D can be 100, 99, 95, 90, 85, 80, 75, or less. In one embodiment, point A can be (1161, 100), point B can be (1161, 95), point C can be (1000, 95), and point D can be (1000, 100).

[0101] An isolated nucleic acid containing a nucleic acid sequence that encodes a polypeptide containing an amino acid sequence having a length and a percent identity to the sequence set forth in SEQ ID NO:97 over that length is within the scope of the invention provided the point defined by that length and percent identity is within the area defined by points A, B, C, and D of FIG. 26; where point A has an X coordinate less than or equal to 386, and a Y coordinate less than or equal to 100; where point B has an X coordinate less than or equal to 386, and a Y coordinate greater than or equal to 50; where point C has an X coordinate greater than or equal to 25, and a Y coordinate greater than or equal to 50; and where point D has an X coordinate greater than or equal to 5, and a Y coordinate less than or equal to 100. For example, the X coordinate for point A can be 386, 380, 375, 370, 375, 360, 365, 350, 325, 300, or less; and the Y coordinate for point A can be 100, 99, 95, 90, 85, 80, 75, or less. The X coordinate for point B can be 386, 380, 375, 370, 375, 360, 365, 350, 325, 300, or less; and the Y coordinate for point B can be 50, 55, 60, 65, 70, 75, 80, 85, 90, 95, 99 or more. The X coordinate for point C can be 25, 30, 35, 40, 50, 60, 70, 80, 90, 100, 150, 200, 300, 350, or more; and the Y coordinate for point C can be 50, 55, 60, 65, 70, 75, 80, 85, 90, 95, 99 or more. The X coordinate for point D can be 5, 6, 7, 8, 9, 10, 15, 20, 25, 30, 40, 50, 75, 100, 200, 300, 350, or more; and the Y coordinate for point D can be 100, 99, 95, 90, 85, 80, 75, or less. In one embodiment, point A can be (386, 100), point B can be (386, 95), point C can be (350, 95), and point D can be (350, 100).
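
As an illustration of the length and percent identity test described in the preceding paragraphs, the following minimal Python sketch checks whether a point (X = length, Y = percent identity) falls within a four-sided area having corners A, B, C, and D. It assumes that the area is a convex quadrilateral whose corners are listed in order and that points on the boundary count as within the area; FIG. 26 is not reproduced here, so this geometry is an assumption. The name in_area is hypothetical, and the corner values are those of the SEQ ID NO:40 embodiment.

```python
from typing import Sequence, Tuple

Point = Tuple[float, float]  # (X = length, Y = percent identity)


def in_area(point: Point, corners: Sequence[Point]) -> bool:
    """Return True if point lies inside or on a convex polygon.

    corners must be listed in order around the polygon (A, B, C, D).
    The point is inside when it lies on the same side of every edge.
    """
    px, py = point
    side = 0  # sign of the first non-zero cross product seen
    n = len(corners)
    for i in range(n):
        x1, y1 = corners[i]
        x2, y2 = corners[(i + 1) % n]
        # Cross product of the edge vector with the vector to the point.
        cross = (x2 - x1) * (py - y1) - (y2 - y1) * (px - x1)
        if cross != 0:
            current = 1 if cross > 0 else -1
            if side == 0:
                side = current
            elif current != side:
                return False
    return True


# Corners A, B, C, D from the SEQ ID NO:40 embodiment.
CORNERS_SEQ40 = [(1833, 100), (1833, 95), (900, 95), (900, 100)]

print(in_area((1200, 97), CORNERS_SEQ40))  # 1200 bases, 97 percent identity
print(in_area((800, 97), CORNERS_SEQ40))   # 800 bases, 97 percent identity
```

Under these assumptions, a 1200-base sequence having 97 percent identity falls within the area of that embodiment, whereas an 800-base sequence having 97 percent identity does not.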

[0102] The invention also provides isolated nucleic acid that is at least about 12 bases in length (e.g., at least about 13, 14, 15, 16, 17, 18, 19, 20, 25, 30, 40, 50, 60, 100, 250, 500, 750, 1000, 1500, 2000, 3000, 4000, or 5000 bases in length) and hybridizes, under hybridization conditions, to the sense or antisense strand of a nucleic acid having the sequence set forth in SEQ ID NO:1, 2, 37, 38, 40, 41, 95, or 96. The hybridization conditions can be moderately or highly stringent hybridization conditions.

[0103] For the purpose of this invention, moderately stringent hybridization conditions mean the hybridization is performed at about 42° C. in a hybridization solution containing 25 mM KPO₄ (pH 7.4), 5×SSC, 5×Denhardt's solution, 50 μg/mL denatured, sonicated salmon sperm DNA, 50% formamide, 10% Dextran sulfate, and 1-15 ng/mL probe (about 5×10⁷ cpm/μg), while the washes are performed at about 50° C. with a wash solution containing 2×SSC and 0.1% sodium dodecyl sulfate.

[0104] Highly stringent hybridization conditions mean the hybridization is performed at about 42° C. in a hybridization solution containing 25 mM KPO₄ (pH 7.4), 5×SSC, 5×Denhardt's solution, 50 μg/mL denatured, sonicated salmon sperm DNA, 50% formamide, 10% Dextran sulfate, and 1-15 ng/mL probe (about 5×10⁷ cpm/μg), while the washes are performed at about 65° C. with a wash solution containing 0.2×SSC and 0.1% sodium dodecyl sulfate.
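
The two definitions above share the same hybridization step and differ only in the wash step. The following Python sketch records the stated temperatures and wash solutions in a small data structure and reports the parameters that differ. The class and field names are hypothetical and are introduced only for illustration; the numerical values are those given in the two preceding paragraphs.

```python
from dataclasses import dataclass, fields
from typing import Dict, Tuple


@dataclass(frozen=True)
class HybridizationConditions:
    hybridization_temp_c: float  # hybridization temperature, about 42 C in both cases
    wash_temp_c: float           # wash temperature in degrees C
    wash_ssc_x: float            # SSC concentration of the wash solution (e.g. 2 for 2xSSC)
    wash_sds_percent: float      # percent sodium dodecyl sulfate in the wash solution


MODERATELY_STRINGENT = HybridizationConditions(42.0, 50.0, 2.0, 0.1)
HIGHLY_STRINGENT = HybridizationConditions(42.0, 65.0, 0.2, 0.1)


def differing_parameters(
    a: HybridizationConditions, b: HybridizationConditions
) -> Dict[str, Tuple[float, float]]:
    """Return the fields whose values differ between two sets of conditions."""
    return {
        f.name: (getattr(a, f.name), getattr(b, f.name))
        for f in fields(a)
        if getattr(a, f.name) != getattr(b, f.name)
    }


print(differing_parameters(MODERATELY_STRINGENT, HIGHLY_STRINGENT))
```

Only the wash temperature and the SSC concentration of the wash solution are reported as differing.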

[0105] Isolated nucleic acid within the scope of the invention can be obtained using any method including, without limitation, common molecular cloning and chemical nucleic acid synthesis techniques. For example, PCR can be used to obtain an isolated nucleic acid containing a nucleic acid sequence sharing similarity to the sequence set forth in SEQ ID NO:1, 2, 37, 38, 40, 41, 95, or 96. PCR refers to a procedure or technique in which target nucleic acid is amplified in a manner similar to that described in U.S. Pat. No. 4,683,195, and subsequent modifications of the procedure described therein. Generally, sequence information from the ends of the region of interest or beyond is used to design oligonucleotide primers that are identical or similar in sequence to opposite strands of a potential template to be amplified. Using PCR, a nucleic acid sequence can be amplified from RNA or DNA. For example, a nucleic acid sequence can be isolated by PCR amplification from total cellular RNA, total genomic DNA, and cDNA as well as from bacteriophage sequences, plasmid sequences, viral sequences, and the like. When using RNA as a source of template, reverse transcriptase can be used to synthesize complementary DNA strands.

[0106] An isolated nucleic acid within the scope of the invention also can be obtained by mutagenesis. For example, an isolated nucleic acid containing a sequence set forth in SEQ ID NO:1, 2, 37, 38, 40, 41, 95, or 96 can be mutated using common molecular cloning techniques (e.g., site-directed mutagenesis). Possible mutations include, without limitation, deletions, insertions, and substitutions, as well as combinations of deletions, insertions, and substitutions.

[0107] In addition, nucleic acid and amino acid databases (e.g., GenBank®) can be used to obtain an isolated nucleic acid within the scope of the invention. For example, any nucleic acid sequence having some homology to a sequence set forth in SEQ ID NO: 1, 2, 37, 38, 40, 41, 95, or 96, or any amino acid sequence having some homology to a sequence set forth in SEQ ID NO:3, 39, 42, or 97 can be used as a query to search GenBank®.

[0108] Further, nucleic acid hybridization techniques can be used to obtain an isolated nucleic acid within the scope of the invention. Briefly, any nucleic acid having some homology to a sequence set forth in SEQ ID NO: 1, 2, 37, 38, 40, 41, 95, or 96 can be used as a probe to identify a similar nucleic acid by hybridization under conditions of moderate to high stringency. Once identified, the nucleic acid then can be purified, sequenced, and analyzed to determine whether it is within the scope of the invention as described herein.

[0109] Hybridization can be done by Southern or Northern analysis to identify a DNA or RNA sequence, respectively, that hybridizes to a probe. The probe can be labeled with biotin, digoxigenin, an enzyme, or a radioisotope such as ³²P. The DNA or RNA to be analyzed can be electrophoretically separated on an agarose or polyacrylamide gel, transferred to nitrocellulose, nylon, or other suitable membrane, and hybridized with the probe using standard techniques well known in the art such as those described in sections 7.39-7.52 of Sambrook et al., (1989) Molecular Cloning, second edition, Cold Spring Harbor Laboratory, Plainview, N.Y. Typically, a probe is at least about 20 nucleotides in length. For example, a probe corresponding to a 20 nucleotide sequence set forth in SEQ ID NO:1, 2, 37, 38, 40, 41, 95, or 96 can be used to identify an identical or similar nucleic acid. In addition, probes longer or shorter than 20 nucleotides can be used.

[0110] The invention provides isolated nucleic acid that contains the entire nucleic acid sequence depicted in FIG. 2, 3, 7, 8, 10, 11, 28, or 29. In addition, the invention provides isolated nucleic acid that contains a portion of the nucleic acid sequence depicted in FIG. 2, 3, 7, 8, 10, 11, 28, or 29. For example, the invention provides isolated nucleic acid that contains a 15 nucleotide sequence identical to any 15 nucleotide sequence depicted in FIG. 2, 3, 7, 8, 10, 11, 28, or 29 including, without limitation, the sequence starting at nucleotide number 1 and ending at nucleotide number 15, the sequence starting at nucleotide number 2 and ending at nucleotide number 16, the sequence starting at nucleotide number 3 and ending at nucleotide number 17, and so forth. It will be appreciated that the invention also provides isolated nucleic acid that contains a nucleotide sequence that is greater than 15 nucleotides (e.g., 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, or more nucleotides) in length and identical to any portion of the sequence depicted in FIG. 2, 3, 7, 8, 10, 11, 28, or 29. For example, the invention provides isolated nucleic acid that contains a 25 nucleotide sequence identical to any 25 nucleotide sequence depicted in FIG. 2, 3, 7, 8, 10, 11, 28, or 29 including, without limitation, the sequence starting at nucleotide number 1 and ending at nucleotide variations. For example, the STdxsdna sequence can contain one variation provided in FIG. 5 or more than one (e.g., 2, 3, 4, 5, 6, 7, 8, 9, 10, 15, 20, 25, 50, 100, or more) of the variations provided in FIG. 5. It is noted that the full-length nucleic acid sequences depicted in FIG. 5 can encode polypeptides having DXS activity. It also is noted that the nucleic acid sequence depicted in FIG. 2 contains the nucleic acid sequence depicted in FIG. 3.

[0111] FIG. 13 depicts the nucleic acid sequence depicted in FIG. 8 (designated RSddsdna) and the nucleic acid sequence depicted in FIG. 11 (designated STddsdna) aligned with each other as well as aligned with three other nucleic acid sequences. Examples of variations of the RSddsdna sequence include, without limitation, any variation of the RSddsdna sequence provided in FIG. 13. Examples of variations of the STddsdna sequence include, without limitation, any variation of the STddsdna sequence provided in FIG. 13. Such variations are provided in FIG. 13 in that a comparison of the nucleotide (or lack thereof) at a particular position of the RSddsdna sequence or the STddsdna sequence with the nucleotide (or lack thereof) at the same position of any of the other nucleic acid sequences depicted in FIG. 13 provides a list of specific changes for the RSddsdna sequence and the STddsdna sequence. For example, the “a” at position 511 of the RSddsdna sequence or the “a” at position 756 of the STddsdna sequence can be substituted with a “t” as indicated in FIG. 13. Again, it will be appreciated that the RSddsdna sequence as well as the STddsdna sequence can contain any number of variations as well as any combination of types of variations. For example, the RSddsdna sequence can contain one variation provided in FIG. 13 or more than one (e.g., 2, 3, 4, 5, 6, 7, 8, 9, 10, 15, 20, 25, 50, 100, or more) of the variations provided in FIG. 13. Likewise, the STddsdna sequence can contain one variation provided in FIG. 13 or more than one (e.g., 2, 3, 4, 5, 6, 7, 8, 9, 10, 15, 20, 25, 50, 100, or more) of the variations provided in FIG. 13. It is noted that the full-length nucleic acid sequences depicted in FIG. 13 can encode polypeptides having DDS activity. It also is noted that the nucleic acid sequence depicted in FIG. 7 contains the nucleic acid sequence depicted in FIG. 8 and that the nucleic acid sequence depicted in FIG. 10 contains the nucleic acid sequence depicted in FIG. 11.
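
The position-by-position comparison described above can be sketched in Python as follows. This is a minimal illustration only: the aligned strings are hypothetical toy data rather than sequences from FIG. 13, a hyphen is assumed to mark the lack of a nucleotide, and positions are counted along the reference sequence starting at 1, which may differ from the numbering used in the figures. The name list_variations is hypothetical.

```python
from typing import List, Tuple

# (reference position, reference character, alternative character)
Variation = Tuple[int, str, str]


def list_variations(reference: str, others: List[str]) -> List[Variation]:
    """List the changes suggested for an aligned reference sequence.

    All sequences must be aligned rows of equal length, with "-" marking
    the lack of a nucleotide. For a column where the reference has "-",
    the reported position is that of the last reference nucleotide before
    the column, so the entry describes an insertion after that position.
    """
    variations: List[Variation] = []
    ref_pos = 0
    for col, ref_char in enumerate(reference):
        if ref_char != "-":
            ref_pos += 1
        seen = set()
        for other in others:
            alt = other[col]
            if alt != ref_char and alt not in seen:
                seen.add(alt)
                variations.append((ref_pos, ref_char, alt))
    return variations


# Hypothetical toy alignment (not taken from FIG. 13).
reference = "acg-t"
others = ["atg-t", "acgat", "a-g-t"]
for pos, ref_char, alt in list_variations(reference, others):
    print(pos, ref_char, alt)
```

For the toy alignment, the sketch reports a substitution of the “c” at position 2 with a “t”, a deletion of the “c” at position 2, and an insertion of an “a” after position 3.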

[0112] The nucleic acid sequence depicted in FIG. 7 contains a nucleic acid sequence that encodes a R. sphaeroides (ATCC 17023) polypeptide having DDS activity. Another variant of this nucleic acid sequence is the nucleic acid sequence of a clone isolated from R. sphaeroides (ATCC 35053). Briefly, a R. sphaeroides (ATCC 35053) clone was identified and found to contain a sequence identical to the nucleic acid sequence depicted in FIG. 7 with the following three exceptions. The R. sphaeroides (ATCC 35053) clone has a “t” at position 885 rather than a “c”, a “c” inserted after the “c” at position 1620, and a “c” inserted after the “c” at position 1733.

[0113] The nucleic acid depicted in FIG. 8 also contains a nucleic acid sequence that encodes a R. sphaeroides (ATCC 17023) polypeptide having DDS activity. Another variant of this nucleic acid sequence is the nucleic acid sequence of a clone isolated from R. sphaeroides (ATCC 35053). Briefly, a R. sphaeroides (ATCC 35053) clone was identified and found to contain a sequence identical to the nucleic acid sequence depicted in FIG. 8 with the following exception. The R. sphaeroides (ATCC 35053) clone has a “t” at position 514 rather than a “c”.

[0114] FIG. 31 depicts the nucleic acid sequence depicted in FIG. 29 (designated Stdxrcds) aligned with eleven other nucleic acid sequences. Examples of variations of the Stdxrcds sequence include, without limitation, any variation of the Stdxrcds sequence provided in FIG. 31. Such variations are provided in FIG. 31 in that a comparison of the nucleotide (or lack thereof) at a particular position of the Stdxrcds sequence with the nucleotide (or lack thereof) at the same position of any of the other nucleic acid sequences depicted in FIG. 31 provides a list of specific changes for the Stdxrcds sequence. Again, it will be appreciated that the Stdxrcds sequence can contain any number of variations as well as any combination of types of variations. For example, the Stdxrcds sequence can contain one variation provided in FIG. 31 or more than one (e.g., 2, 3, 4, 5, 6, 7, 8, 9, 10, 15, 20, 25, 50, 100, or more) of the variations provided in FIG. 31. It is noted that the full-length nucleic acid sequences depicted in FIG. 31 can encode polypeptides having DXR activity. It also is noted that the nucleic acid sequence depicted in FIG. 29 contains the nucleic acid sequence depicted in FIG. 28.

[0115] The invention also provides isolated nucleic acid that contains a variant of a portion of the nucleic acid sequence depicted in FIG. 2, 3, 7, 8, 10, 11, 28, or 29 as described herein.

[0116] The invention provides isolated nucleic acid that contains a nucleic acid sequence that encodes the entire amino acid sequence depicted in FIG. 4, 9, 12, or 30. In addition, the invention provides isolated nucleic acid that contains a nucleic acid sequence that encodes a portion of the amino acid sequence depicted in FIG. 4, 9, 12, or 30. For example, the invention provides isolated nucleic acid that contains a nucleic acid sequence that encodes a 15 amino acid sequence identical to any 15 amino acid sequence depicted in FIG. 4, 9, 12, or 30 including, without limitation, the sequence starting at amino acid residue number 1 and ending at amino acid residue number 15, the sequence starting at amino acid residue number 2 and ending at amino acid residue number 16, the sequence starting at amino acid residue number 3 and ending at amino acid residue number 17, and so forth. It will be appreciated that the invention also provides isolated nucleic acid that contains a nucleic acid sequence that encodes an amino acid sequence that is greater than 15 amino acid residues (e.g., 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, or more amino acid residues) in length and identical to any portion of the sequence depicted in FIG. 4, 9, 12, or 30. For example, the invention provides isolated nucleic acid that contains a nucleic acid sequence that encodes a 25 amino acid sequence identical to any 25 amino acid sequence depicted in FIG. 4, 9, 12, or 30 including, without limitation, the sequence starting at amino acid residue number 1 and ending at amino acid residue number 25, the sequence starting at amino acid residue number 2 and ending at amino acid residue number 26, the sequence starting at amino acid residue number 3 and ending at amino acid residue number 27, and so forth. Additional examples include, without limitation, isolated nucleic acids that contain a nucleic acid sequence that encodes an amino acid sequence that is 50 or more amino acid residues (e.g., 100, 150, 200, 250, 300, 350, or more amino acid residues) in length and identical to any portion of the sequence depicted in FIG. 4, 9, 12, or 30. Such isolated nucleic acids can include, without limitation, those isolated nucleic acids containing a nucleic acid sequence that encodes an amino acid sequence represented in a single line of sequence depicted in FIG. 4, 9, 12, or 30 since each line of sequence depicted in these figures, with the exception of the last line, provides a 50 amino acid sequence.
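
The enumeration of portions described above (residues 1 to 15, 2 to 16, 3 to 17, and so forth) can be sketched in Python as a sliding window over a sequence. The function name windows and the 20-residue example string are hypothetical and serve only as an illustration; the string is not a sequence depicted in any figure.

```python
from typing import Iterator, Tuple


def windows(sequence: str, length: int) -> Iterator[Tuple[int, int, str]]:
    """Yield (start, end, portion) for every contiguous portion of a given length.

    start and end are residue numbers counted from 1, so the first portion
    of length 15 runs from residue 1 to residue 15.
    """
    for offset in range(len(sequence) - length + 1):
        yield offset + 1, offset + length, sequence[offset:offset + length]


# Hypothetical 20-residue example sequence.
EXAMPLE = "ACDEFGHIKLMNPQRSTVWY"

for start, end, portion in windows(EXAMPLE, 15):
    print(start, end, portion)
```

A sequence of N residues yields N - 15 + 1 portions of 15 residues each, so the 20-residue example yields six. The same function applies to nucleotide sequences and to other portion lengths, such as 25.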

[0117] In addition, the invention provides isolated nucleic acid that contains a nucleic acid sequence that encodes an amino acid sequence having a variation of the amino acid sequence depicted in FIG. 4, 9, 12, or 30. For example, the invention provides isolated nucleic acid containing a nucleic acid sequence encoding an amino acid sequence depicted in FIG. 4, 9, 12, or 30 that contains a single insertion, a single deletion, a single substitution, multiple insertions, multiple deletions, multiple substitutions, or any combination thereof (e.g., single deletion together with multiple insertions). The invention provides multiple examples of isolated nucleic acid containing a nucleic acid sequence encoding an amino acid sequence having a variation of an amino acid sequence depicted in FIG. 4, 9, 12, or 30.

[0118] FIG. 6 depicts the amino acid sequence depicted in FIG. 4 (designated STdxsp) aligned with 20 other amino acid sequences. Examples of variations of the STdxsp sequence include, without limitation, any variation of the STdxsp sequence provided in FIG. 6. Such variations are provided in FIG. 6 in that a comparison of the amino acid residue (or lack thereof) at a particular position of the STdxsp sequence with the amino acid residue (or lack thereof) at the same position of any of the other 20 amino acid sequences depicted in FIG. 6 provides a list of specific changes for the STdxsp sequence. For example, the “t” at position 1148 of the STdxsp sequence can be substituted with an “s” as indicated in FIG. 6. As also indicated in FIG. 6, the “f” at position 575 of the STdxsp sequence can be substituted with an “m”, “a”, “l”, “i”, “y”, or “v”. For FIG. 6, the nucleic acid numbering of FIG. 2 is used to number the amino acid residue positions of the STdxsp sequence. Thus, the first amino acid residue of the STdxsp sequence starts with number 182 and proceeds in increments of three. It will be appreciated that the STdxsp sequence can contain any number of variations as well as any combination of types of variations. For example, the STdxsp sequence can contain one variation provided in FIG. 6 or more than one (e.g., 2, 3, 4, 5, 6, 7, 8, 9, 10, 15, 20, 25, 50, 100, or more) of the variations provided in FIG. 6. It is noted that the 21 full-length amino acid sequences depicted in FIG. 6 can be polypeptides having DXS activity.
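
Because FIG. 6 numbers amino acid residues with the nucleic acid numbering of FIG. 2, a position label can be converted to an ordinal residue number with simple arithmetic. The following Python sketch assumes only what is stated above, namely that the first residue is labeled 182 and that labels proceed in increments of three; the function names are hypothetical.

```python
FIRST_POSITION = 182  # label of the first STdxsp residue in FIG. 6
STEP = 3              # labels proceed in increments of three (one codon per residue)


def residue_to_position(residue_number: int, first: int = FIRST_POSITION) -> int:
    """Return the FIG. 6 style position label for the n-th residue (n counted from 1)."""
    if residue_number < 1:
        raise ValueError("residue numbers start at 1")
    return first + STEP * (residue_number - 1)


def position_to_residue(position: int, first: int = FIRST_POSITION) -> int:
    """Return the ordinal residue number for a FIG. 6 style position label."""
    offset = position - first
    if offset < 0 or offset % STEP != 0:
        raise ValueError("position is not a valid residue label")
    return offset // STEP + 1


print(position_to_residue(1148))  # label cited above for the "t" residue
print(position_to_residue(575))   # label cited above for the "f" residue
```

Under this reading, the labels 1148 and 575 cited above correspond to the 323rd and 132nd residues of the STdxsp sequence, respectively.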

[0119] FIG. 14 depicts the amino acid sequence depicted in FIG. 9 (designated RSddsp) and the amino acid sequence depicted in FIG. 12 (designated STddsp) aligned with each other as well as aligned with three other amino acid sequences. For FIG. 14, the nucleic acid numbering of FIG. 7 is used to number the amino acid residue positions of the RSddsp sequence, and the nucleic acid numbering of FIG. 10 is used to number the amino acid residue positions of the STddsp sequence. Thus, the first amino acid residue of the RSddsp and STddsp sequences each start with a number other than 1 and proceed in increments of three. Examples of variations of the RSddsp sequence include, without limitation, any variation of the RSddsp sequence provided in FIG. 14. Examples of variations of the STddsp sequence include, without limitation, any variation of the STddsp sequence provided in FIG. 14. Such variations are provided in FIG. 14 in that a comparison of the amino acid residue (or lack thereof) at a particular position of the RSddsp sequence or the STddsp sequence with the amino acid residue (or lack thereof) at the same position of any of the other amino acid sequences depicted in FIG. 14 provides a list of specific changes for the RSddsp sequence and the STddsp sequence. For example, the “l” at position 762 of the RSddsp sequence or the “l” at position 1007 of the STddsp sequence can be substituted with an “a” as indicated in FIG. 14. Again, it will be appreciated that the RSddsp sequence as well as the STddsp sequence can contain any number of variations as well as any combination of types of variations. For example, the RSddsp sequence can contain one variation provided in FIG. 14 or more than one (e.g., 2, 3, 4, 5, 6, 7, 8, 9, 10, 15, 20, 25, 50, 100, or more) of the variations provided in FIG. 14. Likewise, the STddsp sequence can contain one variation provided in FIG. 14 or more than one (e.g., 2, 3, 4, 5, 6, 7, 8, 9, 10, 15, 20, 25, 50, 100, or more) of the variations provided in FIG. 14. It is noted that the five full-length amino acid sequences depicted in FIG. 14 can be polypeptides having DDS activity.

[0120] The amino acid sequence depicted in FIG. 9 represents a R. sphaeroides (ATCC 17023) polypeptide having DDS activity. Another variant of this amino acid sequence is the amino acid sequence encoded by a clone isolated from R. sphaeroides (ATCC 35053). Briefly, a R. sphaeroides (ATCC 35053) clone was identified and found to encode an amino acid sequence identical to the amino acid sequence depicted in FIG. 9 with the following exception. The R. sphaeroides (ATCC 35053) clone has a “y” at position 172 rather than an “h”.

[0121] FIG. 32 depicts the amino acid sequence depicted in FIG. 30 (designated Stdxrp) aligned with 15 other amino acid sequences. Examples of variations of the Stdxrp sequence include, without limitation, any variation of the Stdxrp sequence provided in FIG. 32. Such variations are provided in FIG. 32 in that a comparison of the amino acid residue (or lack thereof) at a particular position of the Stdxrp sequence with the amino acid residue (or lack thereof) at the same position of any of the other 15 amino acid sequences depicted in FIG. 32 provides a list of specific changes for the Stdxrp sequence. It will be appreciated that the Stdxrp sequence can contain any number of variations as well as any combination of types of variations. For example, the Stdxrp sequence can contain one variation provided in FIG. 32 or more than one (e.g., 2, 3, 4, 5, 6, 7, 8, 9, 10, 15, 20, 25, 50, 100, or more) of the variations provided in FIG. 32. It is noted that the full-length amino acid sequences depicted in FIG. 32 can be polypeptides having DXR activity.

[0122] The invention also provides isolated nucleic acid containing a nucleic acid sequence encoding an amino acid sequence that contains a variant of a portion of the amino acid sequence depicted in FIG. 4, 9, 12, or 30 as described herein.

[0123] 2. Polypeptides

[0124] The invention provides substantially pure polypeptides. The term “substantially pure” as used herein with reference to a polypeptide means the polypeptide is substantially free of other polypeptides, lipids, carbohydrates, and nucleic acid with which it is naturally associated. Thus, a substantially pure polypeptide is any polypeptide that is removed from its natural environment and is at least 60 percent pure. A substantially pure polypeptide can be at least about 65, 70, 75, 80, 85, 90, 95, or 99 percent pure. Typically, a substantially pure polypeptide will yield a single major band on a non-reducing polyacrylamide gel.

[0125] Any substantially pure polypeptide having an amino acid sequence encoded by a nucleic acid within the scope of the invention is itself within the scope of the invention. In addition, any substantially pure polypeptide containing an amino acid sequence having a length and a percent identity to the sequence set forth in SEQ ID NO:3 over that length as determined herein is within the scope of the invention provided the point defined by that length and percent identity is within the area defined by points A, B, C, and D of FIG. 26; where point A has an X coordinate less than or equal to 641, and a Y coordinate less than or equal to 100; where point B has an X coordinate less than or equal to 641, and a Y coordinate greater than or equal to 50; where point C has an X coordinate greater than or equal to 25, and a Y coordinate greater than or equal to 50; and where point D has an X coordinate greater than or equal to 5, and a Y coordinate less than or equal to 100. For example, the X coordinate for point A can be 641, 635, 630, 625, 620, or less; and the Y coordinate for point A can be 100, 99, 95, 90, 85, 80, 75, or less. The X coordinate for point B can be 641, 635, 630, 625, 620, or less; and the Y coordinate for point B can be 50, 55, 60, 65, 70, 75, 80, 85, 90, 95, 99 or more. The X coordinate for point C can be 25, 30, 35, 40, 50, 60, 70, 80, 90, 100, 150, 200, or more; and the Y coordinate for point C can be 50, 55, 60, 65, 70, 75, 80, 85, 90, 95, 99 or more. The X coordinate for point D can be 5, 6, 7, 8, 9, 10, 15, 20, 25, 30, 40, 50, 75, 100, or more; and the Y coordinate for point D can be 100, 99, 95, 90, 85, 80, 75, or less. In one embodiment, point A can be (641, 100), point B can be (641, 95), point C can be (400, 95), and point D can be (400, 100).

[0126] Any substantially pure polypeptide containing an amino acid sequence having a length and a percent identity to the sequence set forth in SEQ ID NO:39 over that length as determined herein is within the scope of the invention provided the point defined by that length and percent identity is within the area defined by points A, B, C, and D of FIG. 26; where point A has an X coordinate less than or equal to 333, and a Y coordinate less than or equal to 100; where point B has an X coordinate less than or equal to 333, and a Y coordinate greater than or equal to 50; where point C has an X coordinate greater than or equal to 25, and a Y coordinate greater than or equal to 50; and where point D has an X coordinate greater than or equal to 5, and a Y coordinate less than or equal to 100. For example, the X coordinate for point A can be 333, 330, 325, 320, 315, or less; and the Y coordinate for point A can be 100, 99, 95, 90, 85, 80, 75, or less. The X coordinate for point B can be 333, 330, 325, 320, 315, or less; and the Y coordinate for point B can be 50, 55, 60, 65, 70, 75, 80, 85, 90, 95, 99 or more. The X coordinate for point C can be 25, 30, 35, 40, 50, 60, 70, 80, 90, 100, 150, 200, or more; and the Y coordinate for point C can be 50, 55, 60, 65, 70, 75, 80, 85, 90, 95, 99 or more. The X coordinate for point D can be 5, 6, 7, 8, 9, 10, 15, 20, 25, 30, 40, 50, 75, 100, or more; and the Y coordinate for point D can be 100, 99, 95, 90, 85, 80, 75, or less. In one embodiment, point A can be (333, 100), point B can be (333, 95), point C can be (150, 95), and point D can be (150, 100).

[0127] Any substantially pure polypeptide containing an amino acid sequence having a length and a percent identity to the sequence set forth in SEQ ID NO:42 over that length as determined herein is within the scope of the invention provided the point defined by that length and percent identity is within the area defined by points A, B, C, and D of FIG. 26; where point A has an X coordinate less than or equal to 337, and a Y coordinate less than or equal to 100; where point B has an X coordinate less than or equal to 337, and a Y coordinate greater than or equal to 50; where point C has an X coordinate greater than or equal to 25, and a Y coordinate greater than or equal to 50; and where point D has an X coordinate greater than or equal to 5, and a Y coordinate less than or equal to 100. For example, the X coordinate for point A can be 337, 335, 330, 325, 320, 315, or less; and the Y coordinate for point A can be 100, 99, 95, 90, 85, 80, 75, or less. The X coordinate for point B can be 337, 335, 330, 325, 320, 315, or less; and the Y coordinate for point B can be 50, 55, 60, 65, 70, 75, 80, 85, 90, 95, 99 or more. The X coordinate for point C can be 25, 30, 35, 40, 50, 60, 70, 80, 90, 100, 150, 200, or more; and the Y coordinate for point C can be 50, 55, 60, 65, 70, 75, 80, 85, 90, 95, 99 or more. The X coordinate for point D can be 5, 6, 7, 8, 9, 10, 15, 20, 25, 30, 40, 50, 75, 100, or more; and the Y coordinate for point D can be 100, 99, 95, 90, 85, 80, 75, or less. In one embodiment, point A can be (337, 100), point B can be (337, 95), point C can be (150, 95), and point D can be (150, 100).

[0128] Any substantially pure polypeptide containing an amino acid sequence having a length and a percent identity to the sequence set forth in SEQ ID NO:97 over that length as determined herein is within the scope of the invention provided the point defined by that length and percent identity is within the area defined by points A, B, C, and D of FIG. 26; where point A has an X coordinate less than or equal to 386, and a Y coordinate less than or equal to 100; where point B has an X coordinate less than or equal to 386, and a Y coordinate greater than or equal to 50; where point C has an X coordinate greater than or equal to 25, and a Y coordinate greater than or equal to 50; and where point D has an X coordinate greater than or equal to 5, and a Y coordinate less than or equal to 100. For example, the X coordinate for point A can be 386, 380, 375, 370, 375, 360, 365, 350, 325, 300, or less; and the Y coordinate for point A can be 100, 99, 95, 90, 85, 80, 75, or less. The X coordinate for point B can be 386, 380, 375, 370, 375, 360, 365, 350, 325, 300, or less; and the Y coordinate for point B can be 50, 55, 60, 65, 70, 75, 80, 85, 90, 95, 99 or more. The X coordinate for point C can be 25, 30, 35, 40, 50, 60, 70, 80, 90, 100, 150, 200, 300, 350, or more; and the Y coordinate for point C can be 50, 55, 60, 65, 70, 75, 80, 85, 90, 95, 99 or more. The X coordinate for point D can be 5, 6, 7, 8, 9, 10, 15, 20, 25, 30, 40, 50, 75, 100, 200, 300, 350, or more; and the Y coordinate for point D can be 100, 99, 95, 90, 85, 80, 75, or less. In one embodiment, point A can be (386, 100), point B can be (386, 95), point C can be (350, 95), and point D can be (350, 100).
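The area test recited above can be sketched as a simple bounds check. The following is a minimal illustration only, assuming that the area defined by points A, B, C, and D is an axis-aligned rectangle in which X is the length and Y is the percent identity. The function name and the example inputs are hypothetical, and FIG. 26 itself is not reproduced here.

```python
def within_area(length, percent_identity, a, b, c, d):
    """Return True if (length, percent_identity) lies within the area
    bounded by points A, B, C, and D, each given as an (X, Y) tuple.

    Assumption: the area is treated as the bounding box of the four
    points, which is exact when they form an axis-aligned rectangle.
    """
    xs = [p[0] for p in (a, b, c, d)]
    ys = [p[1] for p in (a, b, c, d)]
    return (min(xs) <= length <= max(xs)
            and min(ys) <= percent_identity <= max(ys))


# Points from the embodiment recited for SEQ ID NO:97.
A, B, C, D = (386, 100), (386, 95), (350, 95), (350, 100)

inside = within_area(360, 97.5, A, B, C, D)   # hypothetical sequence
outside = within_area(300, 97.5, A, B, C, D)  # shorter than point C allows
```

With these points, a hypothetical sequence of length 360 having 97.5 percent identity falls within the area, while one of length 300 does not.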

[0129] Any method can be used to obtain a substantially pure polypeptide. For example, common polypeptide purification techniques such as affinity chromatography and HPLC as well as polypeptide synthesis techniques can be used. In addition, any material can be used as a source to obtain a substantially pure polypeptide. For example, tissue from wild-type or transgenic animals can be used as a source material. In addition, tissue culture cells engineered to over-express a particular polypeptide of interest can be used to obtain substantially pure polypeptide. Further, a polypeptide within the scope of the invention can be “engineered” to contain an amino acid sequence that allows the polypeptide to be captured onto an affinity matrix. For example, a tag such as c-myc, hemagglutinin, polyhistidine, or Flag™ tag (Kodak) can be used to aid polypeptide purification. Such tags can be inserted anywhere within the polypeptide including at either the carboxyl or amino termini. Other fusions that could be useful include enzymes that aid in the detection of the polypeptide, such as alkaline phosphatase.

[0130] The invention provides polypeptides that contain the entire amino acid sequence depicted in FIG. 4, 9, 12, or 30. In addition, the invention provides polypeptides that contain a portion of the amino acid sequence depicted in FIG. 4, 9, 12, or 30. For example, the invention provides polypeptides that contain a 15 amino acid sequence identical to any 15 amino acid sequence depicted in FIG. 4, 9, 12, or 30 including, without limitation, the sequence starting at amino acid residue number 1 and ending at amino acid residue number 15, the sequence starting at amino acid residue number 2 and ending at amino acid residue number 16, the sequence starting at amino acid residue number 3 and ending at amino acid residue number 17, and so forth. It will be appreciated that the invention also provides polypeptides that contain an amino acid sequence that is greater than 15 amino acid residues (e.g., 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, or more amino acid residues) in length and identical to any portion of the sequence depicted in FIG. 4, 9, 12, or 30. For example, the invention provides polypeptides that contain a 25 amino acid sequence identical to any 25 amino acid sequence depicted in FIG. 4, 9, 12, or 30 including, without limitation, the sequence starting at amino acid residue number 1 and ending at amino acid residue number 25, the sequence starting at amino acid residue number 2 and ending at amino acid residue number 26, the sequence starting at amino acid residue number 3 and ending at amino acid residue number 27, and so forth. Additional examples include, without limitation, polypeptides that contain an amino acid sequence that is 50 or more amino acid residues (e.g., 100, 150, 200, 250, 300, 350, or more amino acid residues) in length and identical to any portion of the sequence depicted in FIG. 4, 9, 12, or 30. 
Such polypeptides can include, without limitation, those polypeptides containing an amino acid sequence represented in a single line of sequence depicted in FIG. 4, 9, 12, or 30, since each line of sequence depicted in these figures, with the possible exception of the last line, provides a 50 amino acid sequence.
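The enumeration of portions described above (residues 1 to 15, 2 to 16, 3 to 17, and so forth) amounts to a sliding window over the sequence. The following is a minimal sketch; the function name and the toy sequence are hypothetical and are not taken from any figure.

```python
def windows(sequence, size):
    """Return (start, end, fragment) for every contiguous fragment of
    the given size, using 1-based inclusive residue numbers."""
    return [
        (i + 1, i + size, sequence[i:i + size])
        for i in range(len(sequence) - size + 1)
    ]


# Hypothetical 20-residue sequence used only for illustration.
toy = "MKTAYIAKQRQISFVKSHFS"
frags = windows(toy, 15)
for start, end, fragment in frags[:3]:
    print(start, end, fragment)
```

A sequence of N residues yields N - size + 1 such fragments, so the 20-residue toy sequence yields six 15-residue fragments.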

[0131] In addition, the invention provides polypeptides that contain an amino acid sequence having a variation of the amino acid sequence depicted in FIG. 4, 9, 12, or 30. For example, the invention provides polypeptides containing an amino acid sequence depicted in FIG. 4, 9, 12, or 30 that contains a single insertion, a single deletion, a single substitution, multiple insertions, multiple deletions, multiple substitutions, or any combination thereof (e.g., single deletion together with multiple insertions). The invention provides multiple examples of polypeptides containing an amino acid sequence having a variation of an amino acid sequence depicted in FIG. 4, 9, 12, or 30.

[0132] FIG. 6 depicts the amino acid sequence depicted in FIG. 4 (designated STdxsp) aligned with 20 other amino acid sequences. Examples of variations of the STdxsp sequence include, without limitation, any variation of the STdxsp sequence provided in FIG. 6. Such variations are provided in FIG. 6 in that a comparison of the amino acid residue (or lack thereof) at a particular position of the STdxsp sequence with the amino acid residue (or lack thereof) at the same position of any of the other 20 amino acid sequences depicted in FIG. 6 provides a list of specific changes for the STdxsp sequence. For example, the “t” at position 1148 of the STdxsp sequence can be substituted with an “s” as indicated in FIG. 6. As also indicated in FIG. 6, the “f” at position 575 of the STdxsp sequence can be substituted with an “m”, “a”, “l”, “i”, “y”, or “v”. For FIG. 6, the nucleic acid numbering of FIG. 2 is used to number the amino acid residue positions of the STdxsp sequence. Thus, the first amino acid residue of the STdxsp sequence starts with number 182 and proceeds in increments of three. It will be appreciated that the STdxsp sequence can contain any number of variations as well as any combination of types of variations. For example, the STdxsp sequence can contain one variation provided in FIG. 6 or more than one (e.g., 2, 3, 4, 5, 6, 7, 8, 9, 10, 15, 20, 25, 50, 100, or more) of the variations provided in FIG. 6. It is noted that the 21 full-length amino acid sequences depicted in FIG. 6 can be polypeptides having DXS activity.
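Because the positions used for FIG. 6 follow the nucleic acid numbering, a position can be converted to an ordinal residue number with simple arithmetic. The following is a minimal sketch, assuming only the rule stated above (the first residue is numbered 182 and the numbering proceeds in increments of three); the function names are hypothetical.

```python
FIRST_CODON_START = 182  # number assigned to the first residue, per the text


def residue_index(position, first=FIRST_CODON_START):
    """Convert a nucleic-acid-based position to a 1-based residue number."""
    offset = position - first
    if offset < 0 or offset % 3 != 0:
        raise ValueError("position does not begin a codon in this numbering")
    return offset // 3 + 1


def numbered_position(residue, first=FIRST_CODON_START):
    """Convert a 1-based residue number back to its numbered position."""
    return first + 3 * (residue - 1)


print(residue_index(575))   # (575 - 182) / 3 + 1 = 132
print(residue_index(1148))  # (1148 - 182) / 3 + 1 = 323
```

Under this assumption, positions 575 and 1148 correspond to the 132nd and 323rd residues, respectively.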

[0133] FIG. 14 depicts the amino acid sequence depicted in FIG. 9 (designated RSddsp) and the amino acid sequence depicted in FIG. 12 (designated STddsp) aligned with each other as well as aligned with three other amino acid sequences. For FIG. 14, the nucleic acid numbering of FIG. 7 is used to number the amino acid residue positions of the RSddsp sequence, and the nucleic acid numbering of FIG. 10 is used to number the amino acid residue positions of the STddsp sequence. Thus, the first amino acid residue of the RSddsp and STddsp sequences each start with a number other than 1 and proceed in increments of three. Examples of variations of the RSddsp sequence include, without limitation, any variation of the RSddsp sequence provided in FIG. 14. Examples of variations of the STddsp sequence include, without limitation, any variation of the STddsp sequence provided in FIG. 14. Such variations are provided in FIG. 14 in that a comparison of the amino acid residue (or lack thereof) at a particular position of the RSddsp sequence or the STddsp sequence with the amino acid residue (or lack thereof) at the same position of any of the other amino acid sequences depicted in FIG. 14 provides a list of specific changes for the RSddsp sequence and the STddsp sequence. For example, the “l” at position 762 of the RSddsp sequence or the “l” at position 1007 of the STddsp sequence can be substituted with an “a” as indicated in FIG. 14. Again, it will be appreciated that the RSddsp sequence as well as the STddsp sequence can contain any number of variations as well as any combination of types of variations. For example, the RSddsp sequence can contain one variation provided in FIG. 14 or more than one (e.g., 2, 3, 4, 5, 6, 7, 8, 9, 10, 15, 20, 25, 50, 100, or more) of the variations provided in FIG. 14. Likewise, the STddsp sequence can contain one variation provided in FIG. 14 or more than one (e.g., 2, 3, 4, 5, 6, 7, 8, 9, 10, 15, 20, 25, 50, 100, or more) of the variations provided in FIG. 14. It is noted that the five full-length amino acid sequences depicted in FIG. 14 can be polypeptides having DDS activity.

[0134] FIG. 32 depicts the amino acid sequence depicted in FIG. 30 (designated Stdxrp) aligned with 15 other amino acid sequences. Examples of variations of the Stdxrp sequence include, without limitation, any variation of the Stdxrp sequence provided in FIG. 32. Such variations are provided in FIG. 32 in that a comparison of the amino acid residue (or lack thereof) at a particular position of the Stdxrp sequence with the amino acid residue (or lack thereof) at the same position of any of the other 15 amino acid sequences depicted in FIG. 32 provides a list of specific changes for the Stdxrp sequence. It will be appreciated that the Stdxrp sequence can contain any number of variations as well as any combination of types of variations. For example, the Stdxrp sequence can contain one variation provided in FIG. 32 or more than one (e.g., 2, 3, 4, 5, 6, 7, 8, 9, 10, 15, 20, 25, 50, 100, or more) of the variations provided in FIG. 32. It is noted that the full-length amino acid sequences depicted in FIG. 32 can be polypeptides having DXR activity.
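The way an alignment provides a list of specific changes for a reference sequence can be sketched as a column-by-column comparison. The following is an illustration only: the function name is hypothetical, the toy alignment is invented and is not taken from FIG. 6, 14, or 32, a "-" character is assumed to mark the lack of a residue, and columns are numbered 1, 2, 3, and so on rather than with the nucleic acid numbering used in the figures.

```python
def variations(reference, others):
    """Map each 1-based alignment column to (reference residue,
    sorted list of differing residues or gaps seen in other sequences)."""
    if any(len(seq) != len(reference) for seq in others):
        raise ValueError("all aligned sequences must have the same length")
    changes = {}
    for col, ref_res in enumerate(reference):
        alternatives = sorted({seq[col] for seq in others} - {ref_res})
        if alternatives:
            changes[col + 1] = (ref_res, alternatives)
    return changes


# Invented toy alignment; "-" marks the lack of a residue.
reference = "mkt-lf"
others = ["mrt-lf", "mktalm", "m-t-la"]
changes = variations(reference, others)
print(changes)
```

In this toy alignment, column 2 yields a substitution ("r") and a deletion ("-"), column 4 yields an insertion relative to the reference ("a"), and column 6 yields two substitutions ("a" and "m").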

[0135] The invention also provides polypeptides containing an amino acid sequence that contains a variant of a portion of the amino acid sequence depicted in FIG. 4, 9, 12, or 30 as described herein.

[0136] 3. Genetically Modified Cells

[0137] Any cell containing an isolated nucleic acid within the scope of the invention is itself within the scope of the invention. This includes, without limitation, prokaryotic cells such as cells from the Rhodospirillaceae family (e.g., Rhodobacter cells) and eukaryotic cells such as plant and mammalian cells. It is noted that cells containing an isolated nucleic acid of the invention are not required to express the isolated nucleic acid. In addition, the isolated nucleic acid can be integrated into the genome of the cell or maintained in an episomal state. In other words, cells can be stably or transiently transformed with an isolated nucleic acid of the invention.

[0138] Any method can be used to introduce an isolated nucleic acid into a cell. In fact, many methods for introducing nucleic acid into a cell, whether in vivo or in vitro, are well known to those skilled in the art. For example, calcium phosphate precipitation, electroporation, heat shock, lipofection, microinjection, conjugation, and viral-mediated nucleic acid transfer are common methods that can be used to introduce nucleic acid into a cell. In addition, naked DNA can be delivered directly to cells in vivo as described elsewhere (U.S. Pat. No. 5,580,859 and U.S. Pat. No. 5,589,466 including continuations thereof). Further, nucleic acid can be introduced into cells by generating transgenic animals.

[0139] Any method can be used to identify cells that contain an isolated nucleic acid within the scope of the invention. For example, PCR and nucleic acid hybridization techniques such as Northern and Southern analysis can be used. In some cases, immunohistochemistry and biochemical techniques can be used to determine if a cell contains a particular nucleic acid by detecting the expression of a polypeptide encoded by that particular nucleic acid. For example, detection of polypeptide X-immunoreactivity after introduction of an isolated nucleic acid containing a cDNA that encodes polypeptide X into a cell that does not normally express polypeptide X can indicate that that cell not only contains the introduced nucleic acid but also expresses the encoded polypeptide X from that introduced nucleic acid. In this case, the detection of any enzymatic activities of polypeptide X also can indicate that that cell contains the introduced nucleic acid and expresses the encoded polypeptide X from that introduced nucleic acid.

[0140] Any method can be used to direct the expression of an amino acid sequence from a nucleic acid. Such methods are well known to those skilled in the art, and include, without limitation, constructing a nucleic acid such that a regulatory element drives the expression of a nucleic acid sequence that encodes a polypeptide. Typically, regulatory elements are DNA sequences that regulate the expression of other DNA sequences at the level of transcription. Such regulatory elements include, without limitation, promoters, enhancers, and the like. In addition, any method for expressing a polypeptide from an exogenous nucleic acid molecule in microorganisms such as bacteria and yeast can be used. For example, well-known methods for making and using nucleic acid constructs that are capable of expressing exogenous polypeptides within Rhodobacter species (e.g., R. sphaeroides and R. capsulatus) can be used. See, e.g., Dryden and Dowhan, J. Bacteriol., 178(4):1030-1038 (1996); Vasilyeva et al., Applied Biochemistry and Biotechnology, 77-79:337-345 (1999); Graichen et al., J. Bacteriol., 181(14):4216-4222 (1999); Johnson et al., J. Bacteriol., 167(2):604-610 (1986); and Duport et al., Gene, 145:103-108 (1994). Further, any method can be used to identify cells that express an amino acid sequence from a nucleic acid. Such methods are well known to those skilled in the art, and include, without limitation, immunocytochemistry, Western analysis, Northern analysis, and RT-PCR.

[0141] The cells described herein can contain a single copy, or multiple copies (e.g., about 5, 10, 20, 35, 50, 75, 100 or 150 copies), of a particular exogenous nucleic acid. For example, a bacterial cell can contain about 50 copies of exogenous nucleic acid X. In addition, the cells described herein can contain more than one particular exogenous nucleic acid. For example, a bacterial cell can contain about 50 copies of exogenous nucleic acid X as well as about 75 copies of exogenous nucleic acid Y. In these cases, each different nucleic acid can encode a different polypeptide having its own unique enzymatic activity. For example, a bacterial cell can contain two different exogenous nucleic acids such that a high level of CoQ(10) is produced. In this example, such a cell can contain a first exogenous nucleic acid that encodes a polypeptide having DXS activity and a second exogenous nucleic acid that encodes a polypeptide having DDS activity. In addition, a single exogenous nucleic acid can encode one or more than one polypeptide. For example, a single nucleic acid can contain sequences that encode three different polypeptides.

[0142] In addition to providing cells that contain an isolated nucleic acid of the invention, the invention provides cells (e.g., plant cells, animal cells, and microorganisms) that can be used to produce an isoprenoid compound such as CoQ(10). The term “microorganism” as used herein refers to all microscopic organisms including, without limitation, bacteria, algae, fungi, and protozoa. It is noted that bacterial cells can be membraneous bacteria or non-membraneous bacteria.

[0143] The term “non-membraneous bacteria” as used herein refers to any bacteria lacking an intracytoplasmic membrane. The term “membraneous bacteria” as used herein refers to any naturally-occurring, genetically modified, or environmentally modified bacteria having an intracytoplasmic membrane. An intracytoplasmic membrane can be organized in a variety of ways including, without limitation, vesicles, tubules, thylakoid-like membrane sacs, and highly organized membrane stacks. Any method can be used to analyze bacteria for the presence of intracytoplasmic membranes including, without limitation, electron microscopy, light microscopy, and density gradients. See, e.g., Chory et al., J. Bacteriol., 159:540-554 (1984); Niederman and Gibson, Isolation and Physicochemical Properties of Membranes from Purple Photosynthetic Bacteria. In: The Photosynthetic Bacteria, Ed. by Roderick K. Clayton and William R. Sistrom, Plenum Press, pp. 79-118 (1978); and Lueking et al., J. Biol. Chem., 253: 451-457 (1978). Examples of membraneous bacteria that can be used herein include, without limitation, bacteria of the Rhodospirillaceae family such as those in the genus Rhodobacter (e.g., R. sphaeroides, R. capsulatus, R. sulfidophilus, R. adriaticus, and R. veldkampii), the genus Rhodospirillum (e.g., R. rubrum, R. photometricum, R. molischianum, R. fulvum, and R. salinarum), the genus Rhodopseudomonas (e.g., R. palustris, R. viridis, and R. sulfoviridis), the genus Rhodomicrobium, the genus Rhodocyclus, and the genus Rhodopila; bacteria of the Chromatiaceae family such as those in the genus Chromatium, the genus Thiocystis, the genus Thiospirillum, the genus Thiocapsa, the genus Lamprobacter, the genus Lamprocystis, the genus Thiodictyon, the genus Amoebobacter, and the genus Thiopedia; green sulfur bacteria such as those in the genus Chlorobium and the genus Prosthecochloris; bacteria of the Methylococcaceae family such as those in the genus Methylococcus (e.g., M. capsulatus), and the genus Methylomonas (e.g., M. methanica); and particular bacteria of the Nitrobacteraceae family such as those in the genus Nitrobacter (e.g., N. winogradskyi and N. hamburgensis), the genus Nitrococcus (e.g., N. mobilis), and the genus Nitrosomonas (e.g., N. europaea).

[0144] Membraneous bacteria can be highly membraneous bacteria. The term “highly membraneous bacteria” as used herein refers to any bacterium having more intracytoplasmic membrane than R. sphaeroides (ATCC 17023) cells have after the R. sphaeroides (ATCC 17023) cells have been (1) cultured chemoheterotrophically under aerobic conditions for four days, (2) cultured chemoheterotrophically under oxygen-limited conditions for four hours, and (3) harvested. The aerobic culture conditions involve culturing the cells in the dark at 30° C. in the presence of 25 percent oxygen. The oxygen-limited conditions involve culturing the cells in the light at 30° C. in the presence of 2 percent oxygen. After the four hour culturing step under oxygen-limited conditions, the R. sphaeroides (ATCC 17023) cells are harvested by centrifugation and analyzed.

[0145] Typically, any cell (e.g., membraneous bacteria) can be genetically modified such that a particular isoprenoid compound is produced. Such cells can contain exogenous nucleic acid that encodes a polypeptide having enzymatic activity. For example, a microorganism having endogenous DDS activity can be transformed with an exogenous nucleic acid that encodes a polypeptide having DDS activity. In this case, the microorganism can have increased DDS activity which can lead to an increased production of CoQ(10). Thus, a cell can be given an exogenous nucleic acid that encodes a polypeptide having an enzymatic activity that catalyzes the production of a compound normally produced by that cell. In this case, the genetically modified cell can produce more of the compound, or can produce the compound more efficiently, than a similar cell not having the genetic modification. Alternatively, a cell can be given an exogenous nucleic acid that encodes a polypeptide having an enzymatic activity that catalyzes the production of a compound that is not normally produced by that cell.

[0146] The invention provides cells containing exogenous nucleic acid that encodes a polypeptide having enzymatic activity that leads to an increased production of CoQ(10). Such cells can contain nucleic acid that encodes a polypeptide having DDS activity. Other examples include, without limitation, cells containing exogenous nucleic acid that encodes polypeptides having DXS, ODS, SDS, DXR, 4-diphosphocytidyl-2C-methyl-D-erythritol synthase (e.g., ispD), 4-diphosphocytidyl-2C-methyl-D-erythritol kinase (e.g., ispE), and/or chorismate lyase (e.g., ubiC) activity. Nucleic acid molecules that encode polypeptides having such enzymatic activities can be obtained as described herein. For example, nucleic acid encoding a polypeptide having chorismate lyase activity can be cloned using the sequence information provided in GenBank® accession number X66619.

[0147] Typically, microorganisms of the invention produce CoQ(10) with the yield (mg of CoQ(10) per g of dry biomass) being at least about 5 (e.g., at least about 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 20, 25, 30, 35, or more) percent greater than that of a comparable wild-type strain grown under similar conditions. Bacteria can produce more CoQ(10) when grown under anaerobic conditions as compared to aerobic conditions. For example, anaerobically cultured bacteria can produce about 3 to 4 fold more CoQ(10) than aerobically cultured bacteria of the same species. When determining the yield of isoprenoid compound production for a particular cell (e.g., microorganism), any method can be used. See, e.g., Cohen-Bazire et al., J. Cell Comp. Physiol., 49:25-68 (1957); Edlund, J. Chromatogr., 425:87-97 (1988); Rousseau and Varin, J. Chromatogr. Sci., 36:247-52 (1998); and Leray et al., J. Lipid Res., 39:2099-2105 (1998).
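The yield comparison described above is a straightforward calculation. The following is a minimal sketch, assuming yield is expressed as mg of CoQ(10) per g of dry biomass as stated; the function names and all measurement values are hypothetical and do not represent experimental results.

```python
def coq10_yield(coq10_mg, dry_biomass_g):
    """Yield in mg of CoQ(10) per g of dry biomass."""
    return coq10_mg / dry_biomass_g


def percent_greater(modified_yield, wild_type_yield):
    """Percent by which the modified yield exceeds the wild-type yield."""
    return 100.0 * (modified_yield - wild_type_yield) / wild_type_yield


# Hypothetical measurements from cultures grown under similar conditions.
wild_type = coq10_yield(coq10_mg=4.0, dry_biomass_g=2.0)  # 2.0 mg/g
modified = coq10_yield(coq10_mg=5.0, dry_biomass_g=2.0)   # 2.5 mg/g

increase = percent_greater(modified, wild_type)  # 25.0 percent
meets_threshold = increase >= 5.0                # "at least about 5 percent"
```

With these hypothetical values, the modified strain yields 25 percent more CoQ(10) than the wild-type strain, which exceeds the 5 percent figure recited above.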

[0148] The invention provides a cell containing an exogenous nucleic acid that encodes a polypeptide having DXS, DDS, ODS, SDS, DXR, 4-diphosphocytidyl-2C-methyl-D-erythritol synthase (e.g., ispD), 4-diphosphocytidyl-2C-methyl-D-erythritol kinase (e.g., ispE), and/or chorismate lyase (e.g., ubiC) activity. Nucleic acid molecules that encode polypeptides having such enzymatic activities can be obtained as described herein. The invention also provides a cell that contains more than one different exogenous nucleic acid molecule with each different exogenous nucleic acid molecule encoding a polypeptide having a different one of the following enzymatic activities: DXS, DDS, ODS, SDS, DXR, 4-diphosphocytidyl-2C-methyl-D-erythritol synthase (e.g., ispD), 4-diphosphocytidyl-2C-methyl-D-erythritol kinase (e.g., ispE), and/or chorismate lyase (e.g., ubiC) activity. For example, the invention provides a cell containing a first exogenous nucleic acid encoding a polypeptide having DXS activity and a second exogenous nucleic acid encoding a polypeptide having DDS activity.

[0149] The invention provides a cell containing an exogenous nucleic acid containing a dxs sequence (e.g., Stdxs sequence), dds sequence (e.g., Stdds or Rsdds sequence), dxr sequence (e.g., Stdxr sequence), ubiC sequence (e.g., EcUbiC sequence), or lytB sequence (e.g., RsLytB sequence). Such nucleic acids can be obtained as described herein. The invention also provides a cell that contains more than one of the following sequences: a dxs sequence (e.g., Stdxs sequence), dds sequence (e.g., Stdds or Rsdds sequence), dxr sequence (e.g., Stdxr sequence), ubiC sequence (e.g., EcUbiC sequence), or lytB sequence (e.g., RsLytB sequence). For example, the invention provides a cell containing a first exogenous nucleic acid containing a dds sequence and a second exogenous nucleic acid containing a dxs sequence. Likewise, the invention provides a cell containing a single exogenous nucleic acid that contains a dds sequence and a dxs sequence.

[0150] Typically, a microorganism within the scope of the invention catabolizes a hexose carbon such as glucose. A microorganism, however, can catabolize a pentose carbon (e.g., ribose, arabinose, xylose, and lyxose). In other words, a microorganism within the scope of the invention can utilize either hexose or pentose carbon. In addition, a microorganism within the scope of the invention can use carbon sources such as methanol and/or organic acids (e.g., succinic acid or malic acid).

[0151] Any cells described herein can have reduced enzymatic activity such as reduced geranylgeranyl pyrophosphate synthase and/or magnesium protoporphyrin IX chelatase activity. Any cell described herein can have reduced biological activity such as reduced activity of aerobic repressor polypeptides (e.g., PPSR) or oxidation-reduction sensor polypeptides (e.g., CBB3). In the case of multi-subunit molecules such as CBB3, the activity of the oxidation-reduction sensor polypeptide can be reduced by inactivating one or more than one of the subunits. For example, CBB3 activity can be reduced by inactivating a single subunit of CBB3 such as the ccoN subunit.

[0152] The term “reduced” as used herein with respect to a cell and a particular activity (e.g., particular enzymatic activity) refers to a lower level of activity than that measured in a comparable cell of the same species. Thus, a R. sphaeroides cell lacking geranylgeranyl pyrophosphate synthase activity is considered to have reduced geranylgeranyl pyrophosphate synthase activity since most, if not all, comparable R. sphaeroides cells have at least some geranylgeranyl pyrophosphate synthase activity. Such reduced enzymatic activities can be the result of lower enzyme concentration, lower specific activity of an enzyme, or combinations thereof.

[0153] Many different methods can be used to make a cell having reduced enzymatic and/or biological activity. For example, a R. sphaeroides cell can be engineered to have a disrupted enzyme-encoding locus using common mutagenesis or knock-out technology. Alternatively, antisense technology can be used to reduce enzymatic activity. For example, a R. sphaeroides cell can be engineered to contain a cDNA that encodes an antisense molecule that prevents an enzyme from being made. The term “antisense molecule” as used herein encompasses any nucleic acid that contains sequences that correspond to the coding strand of an endogenous polypeptide. An antisense molecule also can have flanking sequences (e.g., regulatory sequences). Thus, antisense molecules can be ribozymes or antisense oligonucleotides. A ribozyme can have any general structure including, without limitation, hairpin, hammerhead, or axhead structures, provided the molecule cleaves RNA.

[0154] Cells having a reduced enzymatic and/or biological activity can be identified using any method. For example, a R. sphaeroides cell having reduced geranylgeranyl pyrophosphate synthase activity can be easily identified using common biochemical methods that measure geranylgeranyl pyrophosphate synthase activity. See, e.g., Math et al., Proc. Natl. Acad. Sci. USA, 89(15):6761-6764 (1992).

[0155] The invention provides a cell containing reduced geranylgeranyl diphosphate synthase, aerobic repressor, and/or cbb3-type cytochrome oxidase activity. Such cells can have reduced geranylgeranyl diphosphate synthase, aerobic repressor, and/or cbb3-type cytochrome oxidase activity as a result of disrupting the endogenous sequences that encode polypeptides having these activities. For example, a cell can have reduced geranylgeranyl diphosphate synthase activity as a result of knocking out a portion of the endogenous crtE sequence within a cell's genome; a cell can have reduced aerobic repressor activity as a result of knocking out a portion of the endogenous ppsR sequence within a cell's genome; and a cell can have reduced cbb3-type cytochrome oxidase activity as a result of knocking out a portion of the endogenous ccoN sequence within a cell's genome.

[0156] The invention also provides a cell containing non-functional crtE, ppsR, and/or ccoN nucleic acid sequences within its genome such that the encoded polypeptide is either mutated or not expressed. Such cells can be used to produce large amounts of CoQ(10). The sequence of crtE can be as set forth in GenBank® accession number AJ010302. The sequence of ppsR can be as set forth in GenBank® accession number AJ010302 or L19596. The sequence of ccoN can be as set forth in GenBank® accession number U58092. Knockout technology can be used to make cells containing non-functional crtE, ppsR, and/or ccoN nucleic acid sequences.

[0157] 4. Producing Isoprenoid Compounds

[0158] The cells described herein can be used to produce isoprenoid compounds. For example, a microorganism having endogenous DDS activity can be transformed with nucleic acid that encodes a polypeptide having DDS activity such that the microorganism produces more CoQ(10) than had the microorganism not been given that nucleic acid. Once transformed, the microorganism can be cultured under conditions optimal for CoQ(10) production.

[0159] In addition, substantially pure polypeptides having enzymatic activity can be used alone or in combination with cells to produce isoprenoid compounds. For example, a preparation containing a substantially pure polypeptide having DDS activity can be used to catalyze the formation of CoQ(10). Further, cell-free extracts containing a polypeptide having enzymatic activity can be used alone or in combination with substantially pure polypeptides and/or cells to produce isoprenoid compounds. For example, a cell-free extract containing a polypeptide having DXS activity can be used to form 1-deoxyxylulose-5-phosphate, while a microorganism containing polypeptides having the enzymatic activities necessary to catalyze the reactions needed to form CoQ(10) from 1-deoxyxylulose-5-phosphate can be used to produce CoQ(10). Any method can be used to produce a cell-free extract. For example, osmotic shock, sonication, and/or a repeated freeze-thaw cycle followed by filtration and/or centrifugation can be used to produce a cell-free extract from intact cells.

[0160] It is noted that a cell, substantially pure polypeptide, and/or cell-free extract can be used to produce a particular isoprenoid compound that is, in turn, treated chemically to produce another compound. For example, a microorganism can be used to produce CoQ(10), while a chemical process is used to modify CoQ(10) into a CoQ(10) derivative such as CoQ(10) containing a polar group. Likewise, a chemical process can be used to produce a particular compound that is, in turn, converted into an isoprenoid compound using a cell, substantially pure polypeptide, and/or cell-free extract described herein. For example, a chemical process can be used to produce deoxyxylulose-5-phosphate, while a microorganism can be used to convert deoxyxylulose-5-phosphate into CoQ(10).

[0161] Typically, a particular isoprenoid compound is produced by providing a microorganism and culturing the provided microorganism with culture medium such that that isoprenoid compound is produced. In general, the culture media and/or culture conditions can be such that the microorganisms grow to an adequate density and produce the desired compound efficiently. For large-scale production processes, the following methods can be used. First, a large tank (e.g., a 100 gallon, 200 gallon, 500 gallon, or more tank) containing appropriate culture medium with, for example, a glucose carbon source is inoculated with a particular microorganism. After inoculation, the microorganisms are incubated to allow biomass to be produced. Once a desired biomass is reached, the broth containing the microorganisms can be transferred to a second tank. This second tank can be any size. For example, the second tank can be larger, smaller, or the same size as the first tank. Typically, the second tank is larger than the first such that additional culture medium can be added to the broth from the first tank. In addition, the culture medium within this second tank can be the same as, or different from, that used in the first tank. For example, the first tank can contain medium with xylose, while the second tank contains medium with glucose.

[0162] Once transferred, the microorganisms can be incubated to allow for the production of the desired isoprenoid compound. Once produced, any method can be used to isolate the desired compound. For example, if the microorganism releases the desired isoprenoid compound into the broth, then common separation techniques can be used to remove the biomass from the broth, and common isolation procedures (e.g., extraction, distillation, and ion-exchange procedures) can be used to obtain the isoprenoid compound from the microorganism-free broth. In addition, the desired isoprenoid compound can be isolated while it is being produced, or it can be isolated from the broth after the product production phase has been terminated. If the microorganism retains the desired isoprenoid compound, then the biomass can be collected and treated to release the isoprenoid compound, and the released isoprenoid compound can be isolated.

[0163] The invention will be further described in the following examples, which do not limit the scope of the invention described in the claims.

EXAMPLES

Example 1 Cloning Nucleic Acid that Encodes a Sphingomonas trueperi Polypeptide Having DXS Activity

[0164] S. trueperi cells were obtained from the American Type Culture Collection (ATCC Cat. No. 12417). To isolate bacterial genomic DNA, cells were grown in 100-200 mL cultures for 2-3 days at 30° C. on a shaker rotating at 250 rpm. Cultured cells were centrifuged to form a cell pellet, washed by resuspending the pellet in a solution of 10 mM Tris/1 mM EDTA, and centrifuged again as before. The cell pellets were resuspended in 5 mL of GTE buffer per 100 mL of original culture. GTE buffer is 50 mM glucose/25 mM Tris-HCl (pH 8.0)/10 mM EDTA (pH 8.0). The bacterial cell walls were lysed by adding lysozyme (final concentration of 1 mg/mL), Proteinase K (final concentration of 1 mg/mL), and mutanolysin (final concentration of 5.5 μg/mL) to the resuspended cell solution to form a lysing mixture that was incubated for 90 minutes at 37° C. After this incubation, sodium dodecyl sulfate was added to the mixture to a final concentration of 1 percent, and additional Proteinase K was added until the concentration in the solution was 2 mg/mL. After a 1 hour incubation at 50° C., the solution containing the lysed cells was diluted 1:1 with fresh GTE buffer. Once diluted, sodium chloride was added to the solution to a final concentration of 0.15 M. Polypeptides and molecules other than nucleic acids were removed from the lysed bacterial cell solution by adding an equal volume of an organic mixture made up of phenol, chloroform, and isoamyl alcohol at a ratio of 25:24:1 (hereinafter referred to as PCIA). After adding PCIA, the solution was mixed. To separate the organic phase from the DNA-containing aqueous phase, the mixture was centrifuged at 12,000×g for 10 minutes. The aqueous phase was transferred to a clean tube and re-extracted with an equal volume of chloroform alone. The aqueous and organic phases were separated by centrifugation at 3,000×g for 10 minutes.
The aqueous phase was again removed to a new tube and treated with 2.5 mg of RNase to degrade any bacterial RNA present. The purified DNA was recovered by adding 2.5 volumes of ethanol to the aqueous phase. After mixing the solution, the precipitated DNA was removed by spooling it on a glass rod. The spooled DNA was rinsed with 70 percent ethanol. Once rinsed, the ethanol was allowed to evaporate by leaving the DNA exposed to the air until dry. The dried DNA was resuspended in a solution of 10 mM Tris (pH 8.5). The resuspended DNA was re-extracted with PCIA followed by chloroform alone as before. The DNA was re-precipitated by adding one-tenth volume of 7.5 M ammonium acetate and 2.5 volumes ethanol, followed by spooling, rinsing, and air drying. The purified DNA was resuspended in 10 mM Tris (pH 8.5).

[0165] The following polymerase chain reaction (PCR) procedure was used to isolate nucleic acid that encodes a S. trueperi polypeptide having DXS activity. Three degenerate forward PCR primers (F1, F2, and F3) and three degenerate reverse PCR primers (R1, R2, and R3) were designed by comparing sequences of several clones that encode polypeptides having DXS activity (FIG. 15). The sequence of each degenerate primer was as follows: F1: 5′-RTKATTYTMAAYGAYAAYGAAATG-3′ (SEQ ID NO:53) F2: 5′-TTTGAAGARYTVGGYWTTAACTA-3′ (SEQ ID NO:54) F3: 5′-RCAYCARGCTTAYSCVCAYAA-3′ (SEQ ID NO:55) R1: 5′-CGTGYTGYTCDGCRATHGCBAC-3′ (SEQ ID NO:56) R2: 5′-TGYTCDGCRATHGCBACRTCRAA-3′ (SEQ ID NO:57) R3: 5′-GGSCCDATRTAGTTAAWRCC-3′ (SEQ ID NO:58)
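The degree of degeneracy of a primer, that is, the number of distinct oligonucleotide sequences a degenerate primer represents, can be computed from the standard IUPAC nucleotide codes. The following is a minimal sketch; the function name `degeneracy` and the dictionary names are illustrative and are not part of the described procedure.

```python
from math import prod

# Standard IUPAC nucleotide codes mapped to the bases each code represents.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT", "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG", "N": "ACGT",
}

def degeneracy(primer: str) -> int:
    """Return the number of distinct sequences encoded by a degenerate primer."""
    return prod(len(IUPAC[base]) for base in primer.upper())

# Degenerate DXS primers as listed above (SEQ ID NOs: 53-58).
DXS_PRIMERS = {
    "F1": "RTKATTYTMAAYGAYAAYGAAATG",
    "F2": "TTTGAAGARYTVGGYWTTAACTA",
    "F3": "RCAYCARGCTTAYSCVCAYAA",
    "R1": "CGTGYTGYTCDGCRATHGCBAC",
    "R2": "TGYTCDGCRATHGCBACRTCRAA",
    "R3": "GGSCCDATRTAGTTAAWRCC",
}

if __name__ == "__main__":
    for name, seq in DXS_PRIMERS.items():
        print(f"{name}: {len(seq)} nt, degeneracy {degeneracy(seq)}")
```

A primer with higher degeneracy contains a smaller fraction of any one specific sequence, which is one reason a higher total primer concentration may be chosen for it.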

[0166] The primers were used in all logical combinations in PCR using Taq polymerase (Roche Molecular Biochemicals, Indianapolis, Ind.) and 1 ng of purified genomic DNA per microliter of reaction mix. Each PCR reaction was conducted using a touchdown PCR program with four cycles at each of the following annealing temperatures: 60° C., 58° C., 56° C., and 54° C., followed by 25 cycles at 52° C. Each cycle had an initial 30 second denaturing step at 94° C. and a 90 second extension step at 72° C. The program had an initial denaturing step of 2 minutes at 94° C. and final extension step of 5 minutes at 72° C.
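The touchdown program described above can be written out as a per-cycle list of annealing temperatures. The following is a minimal sketch; the function name `touchdown_schedule` and its parameters are illustrative, and the duration of the annealing step is omitted because it is not stated.

```python
def touchdown_schedule(touchdown_temps, cycles_per_temp, final_temp, final_cycles):
    """Return the annealing temperature (deg C) used in each successive cycle."""
    schedule = []
    for temp in touchdown_temps:
        # Each touchdown temperature is held for a fixed number of cycles.
        schedule.extend([temp] * cycles_per_temp)
    # The remaining cycles all use the final annealing temperature.
    schedule.extend([final_temp] * final_cycles)
    return schedule

# Values taken from the program described above.
schedule = touchdown_schedule([60, 58, 56, 54], 4, 52, 25)

if __name__ == "__main__":
    print(f"total cycles: {len(schedule)}")
    for cycle, temp in enumerate(schedule, start=1):
        # 30 s denaturation at 94 C and 90 s extension at 72 C per cycle, as stated.
        print(f"cycle {cycle:2d}: 94 C 30 s / anneal {temp} C / 72 C 90 s")
```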

[0167] Between about 2 μM and 12 μM of each PCR primer was used in each reaction, depending on the degree of degeneracy. After each PCR reaction was complete, a portion of each reaction was separated by gel electrophoresis using a 1.5 percent TAE (Tris-acetate-EDTA) agarose gel. The results from the gel electrophoresis indicated that the combination of degenerate primer F3 with degenerate primer R2 produced a nucleic acid molecule of 882 bp (referred to as the F3R2 fragment). The F3R2 fragment was purified away from the agarose gel matrix using the Qiagen Gel Extraction procedure according to the manufacturer's instructions (Qiagen Inc., Valencia, Calif.). A portion of the purified fragment was ligated into the pCRII-TOPO vector. The vector containing the F3R2 fragment was inserted into E. coli TOP10 cells using the TOPO cloning procedure (Invitrogen, Carlsbad, Calif.). The transformed TOP10 cells were plated onto LB agar plates containing 100 μg/mL of ampicillin (Amp) and 50 μg/mL of 5-Bromo-4-Chloro-3-Indolyl-β-D-Galactopyranoside (Xgal). Single white colonies were re-plated onto fresh LB-Amp-Xgal plates and screened by PCR with the F3 and R2 primers to confirm the presence of plasmids with the desired insert. Plasmid DNAs were obtained from bacterial colonies using the QiaPrep Spin Miniprep Kit (Qiagen, Inc). The plasmid DNAs were then quantified and sequenced with the M13 forward and reverse primers. Sequence analysis indicated that the sequence of the F3R2 fragment aligned with sequences from other nucleic acid molecules that encode polypeptides having DXS activity.

[0168] To obtain the complete coding sequence for the S. trueperi polypeptide having DXS activity, genome walking was performed as follows. Primers were designed based upon the sequence of the 882 bp F3R2 fragment for walking in both the upstream and downstream directions. These walking primers had the following sequences: GSP1F: 5′-TCGTGACCAAGAAGGGCAAGGGCTATG-3′ (SEQ ID NO:59) GSP2F: 5′-GACAAGTATCACGGCGTCCAGAAGTTC-3′ (SEQ ID NO:60) GSP1R: 5′-ATAGCCCTTGCCCTTCTTGGTCACGAC-3′ (SEQ ID NO:61) GSP2R: 5′-CGAACGGATCATACTCGCTCTCGCTG-3′ (SEQ ID NO:62)
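The relationship between the outer walking primers can be checked by taking a reverse complement. In the sequences as listed, the reverse complement of GSP1F matches GSP1R with an offset of one nucleotide, so the two primers anneal to opposite strands at essentially the same site. The following is a minimal sketch; the function name `reverse_complement` is illustrative.

```python
# Complement table for unambiguous DNA bases.
COMPLEMENT = str.maketrans("ACGT", "TGCA")

def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence written 5' to 3'."""
    return seq.upper().translate(COMPLEMENT)[::-1]

GSP1F = "TCGTGACCAAGAAGGGCAAGGGCTATG"  # SEQ ID NO:59
GSP1R = "ATAGCCCTTGCCCTTCTTGGTCACGAC"  # SEQ ID NO:61

if __name__ == "__main__":
    rc = reverse_complement(GSP1F)
    print("GSP1F reverse complement:", rc)
    print("GSP1R:                   ", GSP1R)
    # The two sequences share 26 nucleotides, shifted by one position.
    print("26-nt overlap:", rc[1:] == GSP1R[:-1])
```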

[0169] The GSP1F and GSP2F primers are primers that face downstream of the DXS polypeptide start codon, while the GSP1R and GSP2R primers are primers that face in the opposite direction. In addition, GSP2F and GSP2R are nested inside of the GSP1F and GSP1R primers. Genome walking was conducted according to the manual of CLONTECH's Universal Genome Walking kit (CLONTECH Laboratories, Inc., Palo Alto, Calif.) with the exception that Fsp I and Sma I were used instead of Dra I and EcoR V. The genomic DNA used was from S. trueperi. DMSO was added to the PCR mixture until a final concentration of 5 percent was reached. The PCR reactions were performed using a Perkin Elmer 9700 Thermocycler. The first round of PCR consisted of 7 cycles of 2 seconds at 94° C. and 3 minutes at 72° C., followed by 36 cycles of 2 seconds at 94° C. and 3 minutes at 67° C., with a final extension at 67° C. for 4 minutes. The second round of PCR consisted of 5 cycles of 2 seconds at 94° C. and 3 minutes at 72° C., followed by 24 cycles of 2 seconds at 94° C. and 3 minutes at 67° C., with a final extension at 67° C. for 4 minutes. After the PCR was complete, a portion of the reaction mix from each round was separated by gel electrophoresis using a 1.5 percent TAE agarose gel. Good amplification products were obtained with the Pvu II and Stu I libraries using the GSP1F and GSP2F primers and with the Fsp I and Pvu II libraries using the GSP1R and GSP2R primers. The second round products from each of these libraries were gel purified, cloned using the TOPO cloning procedure (Invitrogen, Carlsbad, Calif.), and sequenced. A 1.7 kilobase (kb) fragment was subcloned from the Pvu II library (forward primers), a 2.8 kb fragment was subcloned from the Stu I library (forward primers), a 400 bp fragment was subcloned from the Fsp I library (reverse primers), and a 330 bp fragment was subcloned from the Pvu II library (reverse primers). Each of these subcloned fragments was sequenced.
Sequence analysis indicated that each subcloned fragment contained a sequence that overlapped with that of the F3R2 fragment and was similar to other nucleic acid sequences that encode polypeptides having DXS activity.

[0170] Because the sequence information obtained by genome walking extended 13 bp upstream of the translational start codon, a second genome walk was conducted to gain additional sequence information. This second walk used GSPB2R, 5′-TGAGGATCTTGTGCGGATAGCATTGGTG-3′ (SEQ ID NO:63) as the first round primer and GSPB3R, 5′-AGCGGCGTCTTGGGTAGGTCAGCCAT-3′ (SEQ ID NO:64) as the second round primer. The second walk was conducted using only the Sma I and Stu I libraries. CLONTECH's Advantage-GC Genomic Polymerase was used for PCR with a 1.0 mM GC Melt concentration according to the manufacturer's specifications. The first round of PCR was conducted using a Perkin Elmer 9700 Thermocycler with an initial denaturing step at 96° C. for 5 seconds followed by 7 cycles consisting of 2 seconds at 94° C. and 3 minutes at 72° C., followed by 36 cycles consisting of 2 seconds at 94° C. and 3 minutes at 66° C., with a final extension at 66° C. for 4 minutes. The second round of PCR had 5 cycles consisting of 2 seconds at 94° C. and 3 minutes at 72° C., followed by 26 cycles consisting of 2 seconds at 94° C. and 3 minutes at 66° C., with a final extension at 66° C. for 4 minutes. Portions of the PCR products from each round were separated by gel electrophoresis using a 1.5 percent TAE agarose gel. The gel electrophoresis revealed the presence of a 250 bp amplification product obtained from the second round of PCR using the Stu I library. This fragment was gel purified, cloned using the TOPO cloning procedure (Invitrogen, Carlsbad, Calif.), and sequenced. An overlap with the previously obtained sequence was found, extending the length of the clone to 181 bp before the start codon. The full-length clone containing coding and non-coding sequence was 3626 bp in length (FIG. 2). The open reading frame was 1926 bp in length (FIG. 3), which encoded a polypeptide with 641 amino acid residues (FIG. 4).
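The reported open reading frame length and polypeptide length can be cross-checked with simple arithmetic. The following is a minimal sketch that assumes the stated open reading frame length includes the stop codon and that the 181 bp before the start codon is the entire upstream non-coding sequence of the clone; the function names are illustrative.

```python
def residues_from_orf(orf_bp: int) -> int:
    """Number of amino acid residues encoded by an ORF whose length
    (in bp) is assumed to include one terminal stop codon."""
    if orf_bp % 3 != 0:
        raise ValueError("ORF length must be a multiple of 3")
    return orf_bp // 3 - 1  # subtract the stop codon

def downstream_bp(clone_bp: int, upstream_bp: int, orf_bp: int) -> int:
    """Non-coding sequence remaining after the ORF, under the stated assumptions."""
    return clone_bp - upstream_bp - orf_bp

if __name__ == "__main__":
    # Values reported above for the S. trueperi DXS clone.
    print("residues:", residues_from_orf(1926))
    print("bp downstream of ORF:", downstream_bp(3626, 181, 1926))
```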

[0171] The coding sequence of the DXS polypeptide was amplified by PCR using S. trueperi genomic DNA as template. Primers were designed based on the sequence obtained above. The sequences of the primers were as follows: SHDXF1: 5′-ATATGGTACCGTGTGACTGACCTGTCCAAC-3′ (SEQ ID NO:65) SHDXR1: 5′-AGTCTCTAGAATGTTGGAGATTCAAGGTGG-3′ (SEQ ID NO:66)

[0172] These primers were designed to introduce a Kpn I restriction site at the beginning of the amplified fragment and an Xba I restriction site at the end of the amplified fragment. The sequence of each restriction site is underlined. The PCR reaction mix contained the following: 100 ng genomic DNA, 2 μL of each primer (SHDXF1 and SHDXR1, each at 50 μM), 10 μL 10×Pfu Plus buffer, 5 μL DMSO, 8 μL dNTPs (10 μM each) and 5 units Pfu polymerase in a final volume of 100 μL. Each PCR reaction was performed in a Perkin Elmer Geneamp PCR system 2400 under the following conditions: an initial denaturation at 94° C. for 5 minutes; 8 cycles of (1) 94° C. for 45 seconds, (2) 55° C. for 45 seconds, and (3) 72° C. for 3 minutes; 21 cycles of (1) 94° C. for 45 seconds, (2) 61° C. for 45 seconds, and (3) 72° C. for 3 minutes; and a final extension of 72° C. for 10 minutes. A portion of the PCR reaction was separated by gel electrophoresis using a 0.8 percent TAE agarose gel. The gel electrophoresis revealed a 1.6 kb fragment. This fragment was (1) purified using a Qiagen Gel Extraction kit (Qiagen Inc., Valencia, Calif.), (2) treated with Kpn I and Xba I (New England BioLabs, Inc., Beverly, Mass.), and (3) subcloned into pUC18 that had also been treated with Kpn I and Xba I and gel purified. The resulting construct, designated as pUC18-SHDXS, is depicted in FIG. 18. The ligation was carried out with T4 DNA ligase at 16° C. for 16 hours. Once ligated, 1 μL was used to electroporate E. coli ElectroMAX™ DH10B™ cells (Life Technologies, Inc., Rockville, Md.). The electroporated cells were plated on LB-Amp plates (Amp concentration=100 μg/mL). From these plates, eight individual colonies were chosen at random. The plasmid was isolated from each colony using a QiaPrep Spin Miniprep Kit (Qiagen Inc., Valencia, Calif.).
The extracted plasmid DNA was examined for the presence of the 1.6 kb fragment by digesting individual aliquots with one of three different restriction enzymes: EcoR I, BamH I, and Nar I. If the plasmids contained the correct 1.6 kb fragment, the EcoR I digest reaction would result in two fragments (0.77 and 4.13 kb), the BamH I digest reaction would result in one fragment (4.8 kb), and the Nar I digest reaction would result in two fragments (1.9 and 2.9 kb). After treating with the restriction enzymes, the digest reactions were separated by gel electrophoresis using a 0.8 percent TAE agarose gel. All 8 clones yielded digestion fragments consistent with a clone of 1.6 kb.
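The final concentrations of components in the 100 μL PCR reaction mix described above follow from the dilution relationship C1·V1 = C2·V2. The following is a minimal sketch for the primer, buffer, and DMSO components; the function name `final_concentration` is illustrative, and neat DMSO is assumed to be 100 percent (v/v).

```python
def final_concentration(stock_conc: float, volume_added_ul: float,
                        final_volume_ul: float) -> float:
    """Concentration after dilution, in the same units as stock_conc."""
    return stock_conc * volume_added_ul / final_volume_ul

FINAL_VOLUME_UL = 100.0  # final reaction volume stated above

if __name__ == "__main__":
    # 2 uL of a 50 uM primer stock.
    print("primer (uM):", final_concentration(50.0, 2.0, FINAL_VOLUME_UL))
    # 10 uL of 10x buffer.
    print("buffer (x):", final_concentration(10.0, 10.0, FINAL_VOLUME_UL))
    # 5 uL of DMSO, assumed to be 100 percent (v/v).
    print("DMSO (% v/v):", final_concentration(100.0, 5.0, FINAL_VOLUME_UL))
```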

Example 2 Introducing Nucleic Acid that Encodes a Polypeptide Having DXS Activity into Cells

[0173] The nucleic acid molecule that encodes a polypeptide having DXS activity and was obtained as described in Example 1 is introduced into cells as follows. First, a construct is made to contain the nucleic acid molecule such that the encoded polypeptide having DXS activity is expressed in a desired host cell. When using prokaryotic cells, a construct functional in prokaryotic cells is used. When using eukaryotic cells, a construct functional in eukaryotic cells is used. Second, the construct is introduced into the desired host cell using appropriate methods. Once introduced, stable transformants are selected.

Example 3 Cloning Nucleic Acid that Encodes a Rhodobacter sphaeroides Polypeptide Having DDS Activity

[0174] R. sphaeroides ATCC strain 17023 cells were grown in 550 R 8 A H media at 30° C. and 100 rpm. The recipe for 550 R 8 A H media was provided by ATCC. Genomic DNA was isolated from R. sphaeroides cells as described in Example 1.

[0175] To isolate nucleic acid encoding an R. sphaeroides polypeptide having DDS activity, degenerate primers were designed and used as described in Example 1. Briefly, three degenerate forward primers (F4, F5, and F6) and four degenerate reverse primers (R4, R5, R6, and R7) were designed by comparing sequences of several clones that encode polypeptides having DDS, SDS, or ODS activity (FIG. 16). The sequence of each degenerate primer was as follows: F4: 5′-GGWGGHAARMGMMTKCGYCC-3′ (SEQ ID NO:67) F5: 5′-ACWYTGSTDCATGATGATGT-3′ (SEQ ID NO:68) F6: 5′-ACNYTNBTNCAYGAYGAYGT-3′ (SEQ ID NO:69) R4: 5′-TYRTCYACSACATCATCATG-3′ (SEQ ID NO:70) R5: 5′-TGHAVKACYTCACCYTCRGMAAT-3′ (SEQ ID NO:71) R6: 5′-TARTCNARDATRTCRTCDAT-3′ (SEQ ID NO:72) R7: 5′-TCRTCNCCNAYNKTYTTNCC-3′ (SEQ ID NO:73)

[0176] These primers were used in all logical combinations in PCR using Taq polymerase (Roche Molecular Biochemicals, Indianapolis, Ind.) and 1 ng of genomic DNA per microliter of reaction mix. PCR was conducted using the touchdown PCR program as described in Example 1. Between about 4 μM and 8 μM of each PCR primer was used in each reaction, depending on the degree of degeneracy. After each PCR reaction was complete, a portion of each reaction was separated by gel electrophoresis using a 1.5 percent TAE agarose gel. The results from the gel electrophoresis yielded no fragments of the expected size. A second amplification reaction was then performed using each sample from the first round of PCR. Briefly, one μL of reaction mixture from each first round of PCR was used in a 50 μL amplification reaction using the same primer pairs and thermocycling parameters used in the first round of PCR. A portion of each of the second round PCR reactions was separated by gel electrophoresis using a 1.5 percent TAE agarose gel. The combination of degenerate primers F6 and R5 produced a fragment of 209 bp (referred to as the F6R5 fragment). The F6R5 fragment was isolated from an agarose gel and purified using the Qiagen Gel Extraction procedure (Qiagen Inc., Valencia, Calif.). An aliquot of the purified fragment was ligated to pCRII-TOPO, and the product of the ligation reaction was inserted into TOP10 E. coli cells using a TOPO cloning procedure (Invitrogen, Carlsbad, Calif.). The products of the individual insertion reactions were plated onto LB media containing 100 μg/mL Amp and 50 μg/mL Xgal. Single white colonies that grew on the LB-Amp-Xgal plates were re-plated onto fresh LB-Amp plates and screened in a PCR reaction using the F6 and R5 primers to confirm the presence of the desired insert. Plasmid DNAs were obtained from several colonies using a QiaPrep Spin Miniprep kit (Qiagen, Inc).
The obtained plasmid DNAs were quantified and sequenced with the M13 forward and reverse primers. Sequence analysis revealed that the F6R5 fragment contained sequences that aligned with sequences from other nucleic acid molecules that encode polypeptides having polyprenyl diphosphate synthase activity.

[0177] Genome walking was performed to obtain a complete coding sequence for the R. sphaeroides DDS polypeptide using procedures similar to those described in Example 1. Briefly, primers were designed based on the sequence of the F6R5 fragment for walking in both the upstream and downstream directions. These primers had the following sequences: GSP3F: 5′-TGGAAGCTGCGGGCGAAGAGATAGTC-3′ (SEQ ID NO:74) GSP4F: 5′-CCCACCAGCACCGAGGATTTGTTGTC-3′ (SEQ ID NO:75) GSP3R: 5′-GAACCTGCTGTGGGACAACAAATCCTC-3′ (SEQ ID NO:76) GSP4R: 5′-TCGGTGCTGGTGGGCGACTATCTCTTC-3′ (SEQ ID NO:77)

[0178] The GSP3F and GSP4F primers are primers that face downstream of the DDS polypeptide start codon, while the GSP3R and GSP4R primers are primers that face in the opposite direction. In addition, the GSP4F and GSP4R primers are nested inside the GSP3F and GSP3R primers.

[0179] The Pvu II, Fsp I, and Stu I libraries with the GSP3F and GSP4F primers and all four libraries with the GSP3R and GSP4R primers resulted in the production of amplified fragments. A 750 bp fragment from the Pvu II library, a 500 bp fragment from the Fsp I library, a 1.4 kb fragment from the Stu I library, and a 0.9 kb fragment from the Sma I library were all subcloned and sequenced. Sequence analysis indicated that each subcloned fragment contained a sequence that overlapped with the sequence of the F6R5 fragment and was similar to other nucleic acid sequences that encode polypeptides having polyprenyl diphosphate synthase activity. The full-length clone containing coding and non-coding sequence was 1990 bp in length (FIG. 7). The open reading frame was 1002 bp in length (FIG. 8), which encoded a polypeptide with 333 amino acid residues (FIG. 9).

[0180] The coding sequence of the DDS polypeptide from R. sphaeroides was amplified by PCR using R. sphaeroides genomic DNA as template. PCR primers were designed based on the sequences obtained as described above. The sequences of the primers were as follows. RDS18F: 5′-ACTAGAATTCCGCAACAGTTCCTTCATGTC-3′ (SEQ ID NO:78) RDS18R: 5′-ATAGAAGCTTACTTGCGGTCGGACTGATAG-3′ (SEQ ID NO:79)

[0181] These primers were designed to introduce an EcoR I restriction site at the beginning of the amplified fragment and a Hind III restriction site at the end of the amplified fragment. The sequence of each restriction site is underlined. The PCR reaction mix contained the following: 100 ng genomic DNA, 2 μL of each primer (RDS18F and RDS18R, each at 50 μM), 10 μL 10×Pfu Plus buffer, 5 μL DMSO, 8 μL dNTPs (10 mM each) and 5 units Pfu polymerase in a final volume of 100 μL. Each PCR reaction was performed in a Perkin Elmer Geneamp PCR system 2400 under the following conditions: an initial denaturation at 94° C. for 5 minutes; 8 cycles of (1) 94° C. for 45 seconds, (2) 55° C. for 45 seconds, and (3) 72° C. for 3 minutes; 21 cycles of (1) 94° C. for 45 seconds, (2) 61° C. for 45 seconds, and (3) 72° C. for 3 minutes; and a final extension of 72° C. for 10 minutes. After completing the PCR reactions, each PCR reaction was separated by gel electrophoresis using a 0.8 percent TAE agarose gel. The gel electrophoresis revealed a 1.6 kb fragment. This fragment was (1) purified from the agarose gel using a Qiagen Gel Extraction kit, (2) digested with EcoR I and Hind III (New England BioLabs, Beverly, Mass.), and (3) ligated to pUC18 that had also been digested with EcoR I and Hind III and gel purified. The resulting construct, designated as pUC18-RSdds, is depicted in FIG. 19. The ligation was carried out with T4 DNA ligase at 16° C. for 16 hours. Once ligated, one μL of the ligation reaction was used to electroporate E. coli ElectroMAX™ DH10B™ cells (Life Technologies, Inc., Rockville, Md.). The electroporated cells were plated onto LB-Amp plates (Amp concentration was 100 μg/mL). From these LB-Amp plates, eight individual colonies were selected at random, and the plasmids within these colonies were purified using a Qiaprep Spin Miniprep kit. These purified plasmids were evaluated for the presence of inserts by restriction enzyme analysis.
If the plasmids contained the correct 1.6 kb fragment, then an EcoR I and Hind III digest reaction would result in two fragments (2.6 and 1.6 kb), and a BamH I digest reaction would result in one fragment (4.2 kb). After treating with the restriction enzymes, the digest reactions were separated by gel electrophoresis using a 0.8 percent TAE agarose gel. Of the eight clones tested, four contained the desired 1.6 kb fragment.
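The expected fragment pattern of a restriction digest of a circular plasmid can be predicted from the plasmid length and the positions of the cut sites. The following is a minimal sketch; the function name `circular_digest` is illustrative, and the cut-site coordinates are hypothetical values chosen only to reproduce the 2.6 kb and 1.6 kb fragments described above (the actual map positions are not given).

```python
def circular_digest(plasmid_bp: int, cut_sites: list[int]) -> list[int]:
    """Return fragment sizes (bp, largest first) after cutting a circular
    plasmid at the given positions."""
    sites = sorted({site % plasmid_bp for site in cut_sites})
    if not sites:
        # No cut: the plasmid stays as one circular molecule of full length.
        return [plasmid_bp]
    # Fragments between consecutive cut sites.
    fragments = [b - a for a, b in zip(sites, sites[1:])]
    # Fragment that wraps around the origin of the circular map.
    fragments.append(plasmid_bp - sites[-1] + sites[0])
    return sorted(fragments, reverse=True)

PLASMID_BP = 4200  # 2.6 kb vector backbone plus 1.6 kb insert, as described

if __name__ == "__main__":
    # Hypothetical coordinates: EcoR I at 0 and Hind III at 1600 flank the insert.
    print("EcoR I + Hind III:", circular_digest(PLASMID_BP, [0, 1600]))
    # Hypothetical single BamH I site; any single cut linearizes the plasmid.
    print("BamH I:", circular_digest(PLASMID_BP, [3000]))
```

A single cutter always yields one fragment equal to the full plasmid length, which is why the BamH I digest serves as a check on total construct size.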

Example 4 Cloning Nucleic Acid that Encodes a Sphingomonas trueperi Polypeptide Having DDS Activity

[0182] S. trueperi cells were grown as described in Example 1. In addition, genomic DNA was isolated from S. trueperi cells as described in Example 1.

[0183] To isolate nucleic acid encoding a polypeptide having DDS activity from S. trueperi, a strategy similar to that described in Example 3 was employed. In this case, four degenerate forward primers (SF1, SF2, SF3, and SF4) and four degenerate reverse primers (SR1, SR2, SR3, and SR4) were designed by comparing sequences of several clones that encode polypeptides having polyprenyl diphosphate synthase activity (FIG. 17). Codon usage tables from twelve Sphingomonas species were used to develop an average preferred codon table that was used in primer design. The sequence of each degenerate primer was as follows: SF1: 5′-CTSSTSCAYGAYGAYGTSGTSGA-3′ (SEQ ID NO:80) SF2: 5′-GTSGMVGSSGGSGGSAARC-3′ (SEQ ID NO:81) SF3: 5′-CTSMTSCAYGAYGAYGTS-3′ (SEQ ID NO:82) SF4: 5′-DSSRTBCTSGTSGGSGAYTT-3′ (SEQ ID NO:83) SR1: 5′-VAKRAARTCSCCSACSAGSAC-3′ (SEQ ID NO:84) SR2: 5′-SACYTCSCCYTCSGCRAT-3′ (SEQ ID NO:85) SR3: 5′-RTCRTCSCCVAYVKTYTTSCC-3′ (SEQ ID NO:86) SR4: 5′-SGGSAGSGTVRBYTTSCCYTC-3′ (SEQ ID NO:87)

[0184] The primers were used in all logical combinations in PCR using Taq polymerase (Roche Molecular Biochemicals, Indianapolis, Ind.) and 1 ng of genomic DNA per microliter of reaction mix. PCR was conducted using the touchdown PCR program as described in Example 1. Between about 4 μM and 20 μM of each PCR primer was used in each reaction depending on the degree of degeneracy. After each PCR reaction was complete, a portion of each reaction was separated by gel electrophoresis using a 1.5 percent TAE agarose gel. Each PCR reaction produced several amplified fragments of the expected sizes based on the coding sequences of other polyprenyl diphosphate synthase polypeptides. These fragments were isolated from TAE agarose gels and purified using the Qiagen Gel Extraction procedure (Qiagen Inc., Valencia, Calif.). An aliquot of each purified fragment was ligated into pCRII-TOPO. The ligated plasmids were then inserted into TOP10 E. coli cells using a TOPO cloning procedure (Invitrogen, Carlsbad, Calif.). The products of each of the individual insertion reactions were plated on LB-Amp-Xgal plates as described in Examples 1 and 3. Single white colonies that grew on the LB-Amp-Xgal plates were re-plated onto fresh LB-Amp-Xgal plates and screened in a PCR reaction using the initial degenerate primers to confirm the presence of the desired insert. Plasmid DNAs having the desired insert were obtained from multiple colonies using a QiaPrep Spin Miniprep Kit (Qiagen, Inc). The obtained plasmid DNAs were then quantified and sequenced using the M13 forward and reverse primers. Sequence analysis revealed that a 201 bp fragment produced using the SF1 and SR2 degenerate primers, a 476 bp fragment produced using the SF1 and SR4 primers, and a 206 bp fragment produced using the SF3 and SR2 primers contained sequences similar to the coding sequences of other polyprenyl diphosphate synthases.

[0185] Genome walking was performed to obtain a complete coding sequence for the S. trueperi DDS polypeptide using procedures similar to those described in Example 1. Briefly, primers were designed based on the sequences of the obtained fragments. These primers had the following sequences: GSP5F: 5′-GTGCTGGTCGGCGACTTCCTGTTCAG-3′ (SEQ ID NO:88) GSP6F: 5′-ATCGACCTGTCCGAGGATCGCTATCTC-3′ (SEQ ID NO:89) GSP5R: 5′-TCGAACGAGCGGCTGAACAGGAAGTC-3′ (SEQ ID NO:90) GSP6R: 5′-TGGCGGGATTGCCCCAGATGATGTTG-3′ (SEQ ID NO:91)

[0186] The GSP5F and GSP6F primers are primers that face downstream of the DDS start codon, while the GSP5R and GSP6R primers are primers that face in the opposite direction. In addition, the GSP6F and GSP6R primers are nested inside the GSP5F and GSP5R primers.

[0187] Genome walking was conducted as described in Example 3 with the exception that the 36 cycles had 3 minute incubations at 66° C. instead of 67° C. and the final extension was performed at 66° C. instead of 67° C. for both the first and second rounds of PCR. Portions of the PCR reactions from each round were separated by gel electrophoresis using a 1.5 percent TAE agarose gel. PCR on the Fsp I and Stu I libraries with the forward primers and of all four libraries with the reverse primers resulted in the production of an amplified fragment. A 1.4 kb fragment from the Fsp I library, a 1.1 kb fragment from the Stu I library (forward primer), a 2.0 kb fragment from the Pvu II library (forward primer), and a 3.0 kb fragment from the Stu I library (reverse primer) were gel purified, cloned using the TOPO cloning procedure, and sequenced as described in Examples 1 and 3. The sequencing analysis revealed that these fragments contained sequences that overlapped with the sequence of the initially obtained fragments and were similar to the coding sequences of other polyprenyl diphosphate synthases. The full-length clone containing coding and non-coding sequence was 1833 bp in length (FIG. 10). The open reading frame was 1014 bp in length (FIG. 11), which encoded a polypeptide with 337 amino acid residues (FIG. 12).

[0188] The coding sequence of the DDS polypeptide from S. trueperi was amplified by PCR using S. trueperi genomic DNA as template. PCR primers were designed based on the sequences obtained as described above. The sequences of the primers were as follows. SHDDSF: 5′-ATTAGGTACCATCAGATAATCGTCGCTCAA-3′ (SEQ ID NO:92) SHDDSR: 5′-TATAGGATCCGACATGGACGAGGAAGACGC-3′ (SEQ ID NO:93)

[0189] These primers were designed to introduce a Kpn I restriction site at the beginning of the amplified fragment and a BamH I restriction site at the end of the amplified fragment. The sequence of each restriction site is underlined. The PCR reactions were performed as described in Example 3 with the exception that primers SHDDSF and SHDDSR were used instead of RDS18F and RDS18R. Once the PCR was completed, the PCR reactions were separated by gel electrophoresis using a 0.3 percent TAE agarose gel. The gel electrophoresis revealed a 1.6 kb fragment. This 1.6 kb fragment was (1) purified using a Qiagen Gel Extraction kit, (2) digested with Kpn I and BamH I (New England BioLabs), and (3) ligated into pUC18 that had also been digested with Kpn I and BamH I and gel purified using methods similar to those described in Example 3. The resulting construct, designated as pUC18-SHDDS, is depicted in FIG. 20. This construct was used to transform cells as described in Example 3. The transformed cells were plated onto LB-Amp plates, and eight individual colonies were selected at random. Plasmid DNA was isolated from each colony using a QiaPrep Spin Miniprep kit. The extracted plasmid DNA was tested for the presence of the 1.6 kb fragment using three different restriction digests. If the plasmids contained the 1.6 kb fragment, then a BamH I and Kpn I digest would yield two fragments (2.68 and 1.62 kb), an EcoR I digest would yield two fragments (1.45 and 2.85 kb), and a Ban II digest would yield two fragments (0.48 and 3.8 kb). All eight plasmids tested yielded digestion fragments consistent with a plasmid containing the desired 1.6 kb fragment.
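The restriction sites engineered into the cloning primers of Examples 1, 3, and 4 can be located by searching each primer for the recognition sequence of the corresponding enzyme. The following is a minimal sketch using the well-known recognition sequences of the five enzymes; the function and dictionary names are illustrative. In each primer sequence as listed, the site begins at the fifth nucleotide.

```python
# Recognition sequences (5' to 3') of the enzymes named in Examples 1, 3, and 4.
RECOGNITION_SITES = {
    "Kpn I": "GGTACC",
    "Xba I": "TCTAGA",
    "EcoR I": "GAATTC",
    "Hind III": "AAGCTT",
    "BamH I": "GGATCC",
}

# Cloning primers and the enzyme each was designed to introduce.
CLONING_PRIMERS = {
    "SHDXF1": ("ATATGGTACCGTGTGACTGACCTGTCCAAC", "Kpn I"),     # SEQ ID NO:65
    "SHDXR1": ("AGTCTCTAGAATGTTGGAGATTCAAGGTGG", "Xba I"),     # SEQ ID NO:66
    "RDS18F": ("ACTAGAATTCCGCAACAGTTCCTTCATGTC", "EcoR I"),    # SEQ ID NO:78
    "RDS18R": ("ATAGAAGCTTACTTGCGGTCGGACTGATAG", "Hind III"),  # SEQ ID NO:79
    "SHDDSF": ("ATTAGGTACCATCAGATAATCGTCGCTCAA", "Kpn I"),     # SEQ ID NO:92
    "SHDDSR": ("TATAGGATCCGACATGGACGAGGAAGACGC", "BamH I"),    # SEQ ID NO:93
}

def site_position(primer: str, enzyme: str) -> int:
    """Return the 0-based index of the enzyme's site in the primer, or -1 if absent."""
    return primer.upper().find(RECOGNITION_SITES[enzyme])

if __name__ == "__main__":
    for name, (seq, enzyme) in CLONING_PRIMERS.items():
        pos = site_position(seq, enzyme)
        site = RECOGNITION_SITES[enzyme]
        # Show the site in brackets within the primer sequence.
        marked = f"{seq[:pos]}[{site}]{seq[pos + len(site):]}"
        print(f"{name} ({enzyme}): {marked}")
```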

Example 5 Measuring CoQ(10)

[0190] Harvested cells were suspended in water to have about 0.1 gm dry weight per mL. The suspension was subjected to a French-press, and the resulting suspension was frozen in 1 mL aliquots until used.

[0191] To measure CoQ(10) in a sample, two aliquots were repeatedly thawed and refrozen 4-5 times. The thawed material was transferred to a 50 mL centrifuge tube, and 1 mL of 5% sodium dodecyl sulfate was added. The material was then flushed with nitrogen. After vortexing for one minute, 6 mL of ethanol was added to the material, and the resulting mixture was vortexed for one minute. Then, 15 mL of hexane was added to the mixture. After vortexing for five minutes, the mixture was centrifuged at 3000 rpm for ten minutes. Once centrifuged, the hexane layer was removed to a conical flask and flushed with nitrogen. This hexane extraction was repeated two times. The three extracts were pooled into a single tube that was evaporated on a vacuum evaporator until the residue was near dryness. The residue was dissolved in 2 mL of mobile phase by vortexing for 2-3 minutes. Once vortexed, the solution was transferred to a 5 mL volumetric flask. The tube that contained the residue was rinsed two additional times with 1 mL of mobile phase. Each time the rinse solution was transferred to the same 5 mL volumetric flask. After adjusting the total volume to 5 mL, the solution was mixed well and stored at −20° C. until analyzed.

[0192] As a control, either water or a culture solution was spiked with standard CoQ(10), extracted as indicated above, and analyzed to determine the recovery of the spiked material. The CoQ(10) standard was a stock solution of CoQ(10), obtained from Sigma. The stock solution was made in HPLC grade ethanol at a concentration of 100 μg/mL, and then diluted to get CoQ(10) solutions ranging from 100 μg/mL to 1 μg/mL.

[0193] HPLC analysis was performed with the following parameters. The mobile phase was ethanol:methanol (7:3) or methanol:isopropylether (9:1). The flow rate was 0.75 mL/min. The column was Waters Nova-Pak C18 (3.9×150 mm; 4 μm). The detector was a PDA set from 200-300 nm with the resolution at 1.2 nm and the maximum absorbance at 275 nm. The run time was 15 minutes, and the injection volume was 50 μL. To calculate the amount of CoQ(10) present, 50 μL of each sample was injected, and the results compared to those obtained using the calibration curve. From these data points, the concentration per gm dry weight was calculated.
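The calculation from a calibration curve to a concentration per gm dry weight can be sketched as follows. This is only an illustration under stated assumptions: a linear calibration is assumed, the peak areas are hypothetical values and not measured data, and the dry weight of 0.2 gm assumes that the two aliquots extracted were each 1 mL at about 0.1 gm dry weight per mL, with the extract brought to a final volume of 5 mL as described above. The function names are illustrative.

```python
def fit_line(x: list[float], y: list[float]) -> tuple[float, float]:
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x

def coq10_per_gm(peak_area: float, slope: float, intercept: float,
                 extract_volume_ml: float = 5.0,
                 dry_weight_gm: float = 0.2) -> float:
    """Micrograms of CoQ(10) per gm dry weight for one injected sample."""
    conc_ug_per_ml = (peak_area - intercept) / slope  # read off the calibration line
    return conc_ug_per_ml * extract_volume_ml / dry_weight_gm

# Standard concentrations within the 1-100 ug/mL range described above.
STANDARD_UG_PER_ML = [1.0, 10.0, 25.0, 50.0, 100.0]
# HYPOTHETICAL peak areas, chosen to be exactly linear for illustration only.
STANDARD_AREAS = [2.0, 20.0, 50.0, 100.0, 200.0]

if __name__ == "__main__":
    slope, intercept = fit_line(STANDARD_UG_PER_ML, STANDARD_AREAS)
    sample_area = 40.0  # HYPOTHETICAL sample peak area
    print("ug CoQ(10) per gm dry weight:",
          coq10_per_gm(sample_area, slope, intercept))
```

Because the injection volume is the same for standards and samples, it cancels out of the calculation and does not appear in the sketch.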

Example 6 Introducing Nucleic Acid that Encodes a Polypeptide Having DDS Activity into Cells and Measuring Isoprenoid Levels

[0194] The following procedures were followed individually for the R. sphaeroides and S. trueperi nucleic acid isolated as described in Examples 3 and 4, respectively.

[0195] Plasmid DNA encoding the polypeptide having DDS activity was electroporated into wild type E. coli strain MG1655. The electroporated cells were plated onto LB-Amp plates. A single individual bacterial colony was picked for each DDS coding sequence, and each colony was grown overnight in 2 mL of LB-Amp at 37° C. with 200 rpm shaking. About 0.75 mL of these overnight cultures were used to inoculate flasks containing 75 mL LB-Amp medium (Amp concentration was 100 μg/mL). These second cultures were grown at 37° C. at 200 rpm for 30 hours. Additional Amp (to a final concentration of 50 μg of fresh Amp per mL) was added to each flask after 12 hours of growth. After 30 hours, the bacteria were collected by centrifugation at 8,000 g for 10 minutes. The resulting bacterial cell pellets were washed by adding 20 mL of 10 mM Tris-HCl buffer (pH 8.0), resuspending the cells, and re-centrifuging as before. Each cell pellet was then resuspended in 10 mL of water. About 0.5 mL of each extract was used for dry mass analysis and the remaining cell suspensions (about 9.5 mL) were frozen at −20° C. overnight.

[0196] The 9.5 mL cell suspensions were used as follows. First, the cells were thawed on ice and lysed by passing the cell suspensions through a French press three times (14,000 psi pressure). The resulting cell extracts were frozen at −20° C. in 1 mL aliquots and maintained on ice prior to analysis.

[0197] High pressure liquid chromatography was performed using Waters' 2690 Alliance integrated system (Waters Corporation, Milford, Mass.). Prior to analysis, all samples and standards were dissolved in HPLC-grade ethanol, loaded into the built-in auto-sampler, and kept at 5°-10° C. in the dark. The separation was carried out using an isocratic elution program of 70:30 ethanol/methanol (v/v) at a flow rate of 1.0 mL/min. The column was a Waters Nova-Pak C18, 3.9×150 mm, equipped with a guard column of the same stationary phase. The injection volume was typically 10-25 μL. Total run time was ten minutes.

[0198] Under these conditions, retention times were 3.1 and 4.9 minutes for CoQ(8) and CoQ(10), respectively. For quantification purposes, a four-point external calibration curve was calculated using freshly prepared CoQ(10) standards. Calibration levels were 1.0, 4.0, 10.0 and 100.0 μg/mL (ppm). Each standard was injected in triplicate, and the resulting calibration plot was linearly fitted with observed r²'s of >0.999.
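
The external calibration described above can be sketched as an ordinary least-squares fit of detector response against standard concentration. In the following Python fragment, the four calibration levels are those stated above, while the peak areas are placeholder values generated from an arbitrary straight line; they are assumptions for illustration only and are not measured responses. The function names are likewise hypothetical.

```python
def fit_line(xs, ys):
    """Least-squares fit y = slope * x + intercept; also returns r squared."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    ss_res = sum((y - (slope * x + intercept)) ** 2 for x, y in zip(xs, ys))
    ss_tot = sum((y - mean_y) ** 2 for y in ys)
    return slope, intercept, 1.0 - ss_res / ss_tot


def concentration_from_area(area, slope, intercept):
    """Invert the calibration line to get concentration (ug/mL) from area."""
    return (area - intercept) / slope


# Calibration levels from the text, in ug/mL.
LEVELS = [1.0, 4.0, 10.0, 100.0]
# Placeholder mean peak areas (assumption): a perfectly linear response.
AREAS = [2.0 * c + 0.5 for c in LEVELS]

if __name__ == "__main__":
    slope, intercept, r2 = fit_line(LEVELS, AREAS)
    print(slope, intercept, r2)
    print(concentration_from_area(50.5, slope, intercept))
```

With real data, each area would be the mean of the triplicate injections, and a fit would be accepted only if r² met the laboratory's criterion.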

[0199] For UV and MS detection, a photodiode array (PDA, Model UV6000LP, ThermoQuest Corp., San Jose, Calif.) and an ion trap mass analyzer (LCQ Classic, Finnigan/ThermoQuest Corp., San Jose, Calif.) were connected in series with the chromatograph and without splitting of the effluent. The PDA was operated in scanning mode from 220-300 nm. Effluent from the PDA was introduced into the mass analyzer via atmospheric-pressure chemical ionization (APCI) using the following parameters: capillary temperature, 150° C.; capillary voltage, 3 kV; vaporizer temperature, 400° C.; sheath gas (N₂) flow, 80 arbitrary units; auxiliary gas (N₂) flow, 5 arbitrary units; and corona discharge needle, 5 mA/6 kV. Positive-ion detection was performed in full scan (250-1000 m/z), 2 microscans, 500 ms ion injection time.

[0200] Under these conditions, CoQ(8) yielded a mass spectrum with a base peak at 727.5 m/z, corresponding to the protonated ‘molecular ion’, as well as several satellite ions from ethanol and/or methanol adducts (FIGS. 23 and 24). Similarly, CoQ(10) yielded a mass spectrum with a base peak at 863.6 m/z corresponding to its protonated ‘molecular ion’ (FIG. 25). Several ethanol and/or methanol satellite adducts were observed as well. Both CoQ(8) and CoQ(10) yielded UV spectra with maxima at 274 nm.
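
The assignment of these base peaks to protonated molecular ions can be checked with a short calculation. The Python sketch below assumes the standard ubiquinone compositions (C₄₉H₇₄O₄ for CoQ(8) and C₅₉H₉₀O₄ for CoQ(10)), expressed as C(9+5n)H(10+8n)O₄ for n isoprenoid units, together with standard monoisotopic masses; these values and the function names are supplied here for illustration and do not appear in the text above.

```python
# Monoisotopic atomic masses (u) and the proton mass; standard reference values.
MONOISOTOPIC = {"C": 12.0, "H": 1.00782503, "O": 15.99491462}
PROTON = 1.00727646


def ubiquinone_formula(n_units):
    """Elemental composition of CoQ(n): C(9+5n) H(10+8n) O4."""
    return {"C": 9 + 5 * n_units, "H": 10 + 8 * n_units, "O": 4}


def monoisotopic_mass(formula):
    """Sum of monoisotopic masses for an element-count dictionary."""
    return sum(MONOISOTOPIC[el] * count for el, count in formula.items())


def protonated_mz(n_units):
    """m/z of the singly charged [M+H]+ ion of CoQ(n)."""
    return monoisotopic_mass(ubiquinone_formula(n_units)) + PROTON


if __name__ == "__main__":
    for n in (8, 10):
        print(n, round(protonated_mz(n), 2))
```

The calculated values lie within about 0.2 m/z of the observed base peaks, which is consistent with the assignment made above.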

[0201] Two samples were analyzed: MG1655 PUC18 and MG1655 PUC18-DDS. MG1655 PUC18 is E. coli strain MG1655 transfected with the PUC18 vector only. MG1655 PUC18-DDS is E. coli strain MG1655 transfected with the PUC18 vector containing nucleic acid that encodes a R. sphaeroides polypeptide having DDS activity. The MG1655 PUC18 specimen contained only CoQ(8) (retention time 3.08 min, FIG. 21) as confirmed by its mass spectrum (FIG. 23), with a base peak at 727.4 m/z and a UV spectrum with a maximum at 274 nm. The MG1655 PUC18-DDS specimen, however, contained CoQ(8) and CoQ(10) (FIG. 22), both of which were confirmed by matching mass spectra (FIGS. 24 and 25) and UV maxima.

Example 7 Cloning Nucleic Acid that Encodes a Sphingomonas trueperi Polypeptide Having DXR Activity

[0202] Sphingomonas trueperi ATCC 12417 cultures (100-200 mL) were grown in nutrient broth at 30° C. and 250 rpm for 2-3 days. The cells then were pelleted and washed with a 10 mM Tris: 1.0 mM EDTA solution. The pellets were resuspended in 5 mL of GTE buffer (50 mM glucose, 25 mM Tris HCl (pH 8.0), 10 mM EDTA (pH 8.0)) per 100 mL of culture. Lysozyme and Proteinase K were added to a 1 mg/mL concentration and mutanolysin was added to 5.5 μg/mL. After a 1.5 hour incubation at 37° C., SDS was added to a final concentration of 1%, and the concentration of Proteinase K was brought to 2 mg/mL. After incubation at 50° C. for one hour, an equal volume of GTE buffer was added, and NaCl was added to a 0.15 M concentration. The mixture was extracted with an equal volume of phenol:chloroform:isoamyl alcohol (25:24:1) and centrifuged at 10,000 rpm for 10 minutes. The supernatant was removed to a clean tube, extracted with an equal volume of chloroform, and centrifuged at 5,000 rpm for 10 minutes. The supernatant was treated with RNAse and precipitated with 2.5 volumes of ethanol. The spooled DNA was washed with 70% ethanol, air dried, and resuspended in 10 mM Tris (pH 8.5). The resuspended DNA was further cleaned by re-extraction with phenol:chloroform:isoamyl alcohol and chloroform, and reprecipitation with 1/10 volume 7.4 M NH₄OAc and 2.5 volumes ethanol.

[0203] A conserved region of the 1-deoxy-D-xylulose 5-phosphate reductoisomerase (dxr) gene was cloned by PCR. Five degenerate forward and five degenerate reverse PCR primers were designed from conserved protein regions that were revealed by aligning known dxr genes (FIG. 27). The degenerate sequences were designed from the conserved regions using the universal codon table. The primers were used in all logical combinations in PCR using Taq polymerase (Roche Molecular Biochemicals, Indianapolis, Ind.) and 1 ng of genomic DNA/μL reaction mix. PCR was conducted using a touchdown PCR program with 4 cycles at an annealing temperature of 59° C., 4 cycles at 57° C., 4 cycles at 55° C., and 24 cycles at 53° C. Each cycle used an initial 30 second denaturing step at 94° C. and a 1.75 minute extension at 72° C., and the program had an initial denaturing step for 2 minutes at 94° C. and final extension of 5 minutes at 72° C. The amounts of PCR primer used in the reaction were increased 3-12 fold above typical PCR amounts depending on the amount of degeneracy in the 3′ end of the primer. In addition, separate PCR reactions containing each individual primer were made to identify PCR products resulting from single degenerate primers. Fifteen μL of each PCR product was separated on a 1.5% TAE (Tris-acetate-EDTA)-agarose gel. Degenerate primers F2 (5′-CCSGTSGAYWSSGARCAYAACGCS-3′ (SEQ ID NO:132)) and R7 (5′-ATGATGAACAAGGGSCTSGAR-3′ (SEQ ID NO:133)) produced a band of about 250 bp, which was the expected size based on dxr genes from other species. This band was not present in the individual F2 and R7 primer control reactions. Degenerate primers F3 (5′-CATCCVAACTGGWMVATGGG-3′ (SEQ ID NO:134)) and R2 (5′-ATYGGYRWWCKCATATCMGG-3′ (SEQ ID NO:135)) produced a band of about 200 bp, which also was the expected size. The F2-R7 and F3-R2 fragments were isolated and purified using a QIAquick Gel Extraction Kit (Qiagen Inc., Valencia, Calif.). Three μL of the purified band was ligated into pCR®II-TOPO vector, which was then transformed by a heat-shock method into TOP10 E. coli cells using a TOPO cloning procedure (Invitrogen, Carlsbad, Calif.). Transformations were plated on LB media containing 100 μg/mL of ampicillin and 50 μg/mL of 5-Bromo-4-Chloro-3-Indolyl-β-D-Galactopyranoside (X-gal). Individual, white colonies were resuspended in about 20 μL of 10 mM Tris and heated for 10 minutes at 95° C. to break open the bacterial cells. To screen individual colonies, 2 μL of the heated cells was used in a 25 μL PCR reaction as described above using the appropriate degenerate primers. Plasmid DNA was obtained with a QIAprep Spin Miniprep Kit (Qiagen, Inc.) from cultures of colonies having the desired insert and used for DNA sequencing with M13R and M13F primers. Sequence analysis revealed that the F2-R7 and F3-R2 fragments overlapped and were homologous to known dxr genes.
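
Because the amount of each degenerate primer was scaled according to its degeneracy, it can be useful to count how many distinct sequences each primer represents. The following Python sketch does so for the four primers named above using the standard IUPAC nucleotide ambiguity codes; the function name is hypothetical, and the sketch counts degeneracy over the whole primer rather than only the 3′ end.

```python
# Standard IUPAC nucleotide ambiguity codes.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT",
    "K": "GT", "M": "AC", "B": "CGT", "D": "AGT",
    "H": "ACT", "V": "ACG", "N": "ACGT",
}

# Degenerate primers as given in the text.
PRIMERS = {
    "F2": "CCSGTSGAYWSSGARCAYAACGCS",
    "R7": "ATGATGAACAAGGGSCTSGAR",
    "F3": "CATCCVAACTGGWMVATGGG",
    "R2": "ATYGGYRWWCKCATATCMGG",
}


def degeneracy(primer):
    """Number of distinct sequences represented by a degenerate primer."""
    count = 1
    for base in primer.upper():
        count *= len(IUPAC[base])
    return count


if __name__ == "__main__":
    for name, seq in PRIMERS.items():
        print(name, len(seq), degeneracy(seq))
```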

[0204] Genome walking was performed to obtain the complete coding sequence as follows. The overlapping of the F2-R7 and F3-R2 fragments resulted in a sequence 358 bp in length. The following four primers for conducting genome walking in both upstream and downstream directions were designed using the portion of this sequence that was internal to the degenerate primers:

GSP1F 5′-CGAATGGACGACGGATTGGCGATGGAC-3′ (SEQ ID NO:136)
GSP2F 5′-TCAGTTCGAGCCCCTTGTTCATCATCGTC-3′ (SEQ ID NO:137)
GSP1R 5′-CGAACTGATCGAAGCCTTCCACCTGTTC-3′ (SEQ ID NO:138)
GSP2R 5′-GGTCCATCGCCAATCCGTCGTCCATTC-3′ (SEQ ID NO:139)

[0205] The GSP1F and GSP2F primers faced upstream, the GSP1R and GSP2R primers faced downstream, and the GSP2F and GSP2R primers were nested inside the GSP1F and GSP1R primers. Genome walking was conducted according to the manual for CLONTECH's Universal Genome Walking Kit (CLONTECH Laboratories, Inc., Palo Alto, Calif.) with the exception that the enzymes FspI and SmaI were used in place of DraI and EcoRV. The DraI and EcoRV enzymes were replaced because they cut S. trueperi genomic DNA too infrequently to give fragment lengths amenable to PCR. The PCR mixture contained 5% DMSO. First round PCR was conducted in a Perkin Elmer 9700 Thermocycler with 7 cycles consisting of 2 seconds at 94° C. and 3 minutes at 72° C., and 36 cycles consisting of 2 seconds at 94° C. and 3 minutes at 66° C., with a final extension at 66° C. for 4 minutes. Second round PCR used 5 cycles consisting of 2 seconds at 94° C. and 3 minutes at 72° C., and 26 cycles consisting of 2 seconds at 94° C. and 3 minutes at 66° C., with a final extension at 66° C. for 4 minutes. Nine μL of the first round product and seven μL of the second round product were separated on a 1.5% TAE-agarose gel. A 1.3 Kb band was obtained from the second round product for the SmaI forward reaction, an 800 bp band for the StuI reverse reaction, and a 750 bp band for the PvuII reverse reaction. These fragments were gel purified, cloned, and sequenced. Internal primers were used to amplify and obtain additional sequence of the gene. Sequence analysis revealed that the sequence derived from genome walking overlapped with the original fragments and contained an entire coding sequence homologous to known dxr genes. The full-length clone containing coding and non-coding sequence was 2017 bp in length (FIG. 28). The open reading frame starting with the first GTG site was 1161 bp in length (FIG. 29), which encoded a polypeptide with 386 amino acid residues (FIG. 30).
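
The relationship between the reported open reading frame length and polypeptide length can be verified with simple arithmetic. The Python sketch below assumes that the 1161 bp figure includes the stop codon, which is an inference from the numbers rather than something stated above; the function name is hypothetical.

```python
def orf_protein_length(orf_bp, includes_stop=True):
    """Number of amino acid residues encoded by an ORF of the given length."""
    if orf_bp % 3 != 0:
        raise ValueError("ORF length must be a multiple of three")
    codons = orf_bp // 3
    # The stop codon, if counted in the ORF length, encodes no residue.
    return codons - 1 if includes_stop else codons


if __name__ == "__main__":
    print(orf_protein_length(1161))
```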

Example 8 Making Recombinant Microorganisms

[0206] Rhodobacter sphaeroides (ATCC 35053) was routinely maintained on Luria-Bertani (Miller) agar (Fisher Scientific) plates. When needed, R. sphaeroides was cultured as follows. A 5 mL culture was grown in a 15 mL culture tube at 30° C. in an Innova 4230 Incubator Shaker (New Brunswick Scientific, Edison, NJ) with a shaking speed of 250 rpm. Each 5 mL culture was started by inoculating liquid media (Sistrom media supplemented with 20% LB) with a single colony. The liquid media contained the following ingredients per liter: 2.72 g KH₂PO₄, 0.5 g (NH₄)₂SO₄, 0.5 g NaCl, 0.2 g EDTA disodium salt, 0.3 g MgSO₄·7H₂O, 0.033 g CaCl₂·2H₂O, 0.2 mg FeSO₄·7H₂O, 0.02 mL (NH₄)₆Mo₇O₂₄·4H₂O (1% solution), 1 mL Trace element solution, 0.2 mL Vitamin solution, 5 g Luria-Bertani Broth Mix, and 8 mL Glucose (50%). The Trace element solution contained the following ingredients per liter: 1.765 g EDTA disodium salt, 10.95 g ZnSO₄·7H₂O, 5 g FeSO₄·7H₂O, 1.54 g MnSO₄·H₂O, 0.392 g CuSO₄·5H₂O, 0.284 g Co(NO₃)₂·6H₂O, and 0.114 g H₃BO₃. The Vitamin solution contained the following ingredients per liter: 10 g Nicotinic acid, 5 g Thiamine HCl, and 0.01 g Biotin. The vitamins and glucose were added after the media cooled to room temperature after autoclaving. When necessary, the media was supplemented with one or more of the following antibiotics: Kanamycin (25 μg/mL; final concentration), Spectinomycin (25 μg/mL; final concentration), and/or Streptomycin (25 μg/mL; final concentration).

[0207] Electrocompetent R. sphaeroides Cells

[0208] Electrocompetent R. sphaeroides cells were made as follows. A 5 mL culture of R. sphaeroides was grown overnight at 30° C. in Sistrom's media supplemented with 20% LB. This culture was diluted 1/100 in 300 mL of the same media and grown to an OD₆₆₀ of 0.5-0.8. The cells were chilled on ice for 10 minutes and then centrifuged for 6 minutes at 7,500 g. The supernatant was discarded, and the cell pellet was resuspended in ice-cold 10% glycerol at half of the original volume. The cells were pelleted by centrifugation for 6 minutes at 7,500 g. The supernatant was again discarded, and the cells were resuspended in ice-cold 10% glycerol at one quarter of the original volume. The last centrifugation and resuspension steps were repeated, followed by centrifugation for 6 minutes at 7,500 g. The supernatant was decanted, and the cells were resuspended in the small volume of glycerol that did not drain out. Additional ice-cold 10% glycerol was added to resuspend the cells, if necessary. Forty μL of the resuspended cells was used in a test electroporation to determine if the cells needed to be concentrated by centrifugation or diluted with 10% ice-cold glycerol. Time constants of 8.5-9.0 milliseconds resulted in good transformation efficiencies. If cells were too dilute, the time constant was greater than 9.0 and transformation efficiencies were low. If cells were too concentrated, the electroporation would spark. Once an acceptable time constant was achieved, cells were aliquoted into cold microfuge tubes and stored at −80° C. All water used for media and glycerol was 18.2 Mohm-cm or higher.

[0209] Electrocompetent R. sphaeroides cells were electroporated as follows. One μL of plasmid DNA was gently mixed into 40 μL of R. sphaeroides electrocompetent cells, which were then transferred to an electroporation cuvette with a 0.2 cm electrode gap. Electroporations were conducted using a Biorad Gene Pulser II (Biorad, Hercules, Calif.) with settings at 2.5 kV of energy, 400 ohms of resistance, and 25 μF of capacitance. Cells were recovered in 400 μL SOC media at 30° C. for 6-16 hours. The cells were then plated (200 μL per plate) on the appropriate selective media. Transformation efficiencies averaged about 2,000 transformants/μg of DNA.
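
A transformation efficiency of the kind reported above is conventionally computed from the colony count, the mass of DNA used, and the fraction of the recovery culture that was plated. The Python sketch below illustrates that arithmetic under stated assumptions: the DNA mass in the 1 μL of plasmid is not given above, so the 0.1 μg used in the example is hypothetical, as are the colony count and the function name, and the small volume contributed by the cells themselves is ignored.

```python
def transformation_efficiency(colonies, dna_ug, plated_ul, recovery_ul):
    """Transformants per microgram of DNA, corrected for the fraction plated."""
    if dna_ug <= 0 or plated_ul <= 0 or recovery_ul <= 0:
        raise ValueError("all quantities must be positive")
    if plated_ul > recovery_ul:
        raise ValueError("cannot plate more than the recovery volume")
    fraction_plated = plated_ul / recovery_ul
    dna_plated_ug = dna_ug * fraction_plated  # DNA represented on the plate
    return colonies / dna_plated_ug


if __name__ == "__main__":
    # 400 uL recovery and 200 uL per plate are from the text; the colony
    # count and DNA mass are illustrative assumptions.
    print(transformation_efficiency(colonies=100, dna_ug=0.1,
                                    plated_ul=200, recovery_ul=400))
```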

[0210] Electrocompetent E. coli Cells

[0211] Electrocompetent E. coli strain S17-1 cells were made as follows. A 5 mL culture of E. coli strain S17-1 was grown overnight at 30° C. in LB media supplemented with 25 μg/mL of streptomycin and 25 μg/mL of spectinomycin. This culture was diluted 1/100 in 300 mL of the same media and grown to an OD₆₆₀ of 0.5-0.8. The cells were chilled on ice for 10 minutes and then centrifuged for 6 minutes at 7,500 g. The supernatant was discarded, and the cell pellet was resuspended in ice-cold 10% glycerol at half of the original volume. The cells were pelleted by centrifugation for 6 minutes at 7,500 g. The supernatant was again discarded, and the cells were resuspended in ice-cold 10% glycerol at one quarter of the original volume. The last centrifugation and resuspension steps were repeated, followed by centrifugation for 6 minutes at 7,500 g. The supernatant was decanted, and the cells were resuspended in the small volume of glycerol that did not drain out. Additional ice-cold 10% glycerol was added to resuspend the cells, if necessary. Cells were aliquoted into cold microfuge tubes and stored at −80° C.

[0212] Electrocompetent E. coli strain S17-1 cells were electroporated as follows. Forty μL of competent cells was used per electroporation. Electroporation was conducted using a Biorad Gene Pulser II and a standard E. coli protocol: 2.5 kV of energy, 200 ohms of resistance, and 25 μF of capacitance. Electroporated cells were recovered in 250-1000 μL of SOC media for one hour, and 10-200 μL of culture was plated per plate of selective media. Transformation efficiencies averaged about 1.5×10⁴ transformants/μg of DNA.

[0213] Constructs

[0214] Various clones were overexpressed in R. sphaeroides using the broad-host-range vector pBBR1MCS2 (Kovach et al., Gene, 166:175-176 (1995)) that was engineered to have either an R. sphaeroides rrnB promoter, an R. sphaeroides glnB promoter, or a tet promoter. The pBBR1MCS2 vector is mobilizable and relatively small (5,144 bp), replicates in R. sphaeroides, has a multiple cloning site with lacZα color selection, and carries a kanamycin resistance gene. All restriction enzymes and T4 DNA ligase were obtained from New England Biolabs (Beverly, Mass.) unless otherwise indicated. All plasmid DNA preparations were done using QIAprep Spin Miniprep Kits or Qiagen Maxi Prep Kits, and all gel purifications were done using QIAquick Gel Extraction Kits (Qiagen, Valencia, Calif.).

[0215] pMCS2rrnBP

[0216] The vector designated pMCS2rrnBP, which contains an R. sphaeroides rrnB promoter, was constructed by inserting a copy of the R. sphaeroides rrnB promoter (rrnBP) into the pBBR1MCS2 vector. The rrnB promoter was isolated from the pTEX124 vector (obtained from S. Kaplan) by digestion with the restriction enzyme BamHI, which releases the promoter as a 363 bp fragment. Alternatively, the rrnB promoter can be obtained by PCR amplifying it from R. sphaeroides genomic DNA using primers based on published rrnB sequence (GenBank® accession number X53854). This fragment was gel purified from a 2% Tris-acetate-EDTA (TAE) agarose gel. The pBBR1MCS2 vector was also digested with BamHI, and the enzyme was heat inactivated at 80° C. for 20 minutes. The digested vector was then dephosphorylated with shrimp alkaline phosphatase (Roche Molecular Biochemicals, Indianapolis, Ind.) and gel purified from a 1% TAE-agarose gel. The prepared vector and the rrnBP fragment were ligated using T4 DNA ligase at 16° C. for 16 hours. One μL of ligation reaction was used to electroporate 40 μL of E. coli Electromax™ DH10B™ cells (Life Technologies, Inc., Rockville, Md.). Electroporated cells were plated on LB media containing 25 μg/mL of kanamycin (LBK). Plasmid DNA was isolated from cultures of single colonies and was digested with HindIII restriction enzyme to confirm the presence of a single insertion of the rrnB promoter. The sequence of the rrnBP inserts for these colonies was also confirmed by DNA sequencing.

[0217] pMCS2glnBP

[0218] The vector designated pMCS2glnBP, which contains an R. sphaeroides glnB promoter, was constructed by inserting a copy of the R. sphaeroides glnB promoter (glnBP) into the pBBR1MCS2 vector. The glnB promoter was PCR amplified from genomic DNA obtained from R. sphaeroides strain 35053. The following primers were designed based on sequence information obtained from GenBank® accession number X71659:

glnBF 5′-ATTATCTAGAATCCGCCCCGCCTCCACCTC-3′ (SEQ ID NO:140)
glnBR 5′-GATGGATCCTGGGTAGGGTCGCTGCTGTCC-3′ (SEQ ID NO:141)

[0219] The primers introduced an XbaI restriction site at the 5′ end and a BamHI restriction site at the 3′ end. The following reaction mix and PCR program were used to amplify the promoter region of the glnB gene.

Reaction Mix
Pfu 10X buffer: 10 μL
DMSO: 5 μL
dNTP mix (10 mM): 4 μL
glnBF (50 μM): 2 μL
glnBR (50 μM): 2 μL
Genomic DNA (50 ng/μL): 2 μL
Pfu enzyme (2.5 U/μL): 2 μL
DI water: 73 μL
Total: 100 μL

PCR program
94° C. for 2 minutes
7 cycles of: 94° C. for 30 seconds, 61° C. for 45 seconds, 72° C. for 3 minutes
25 cycles of: 94° C. for 30 seconds, 66° C. for 45 seconds, 72° C. for 3 minutes
72° C. for 7 minutes
4° C. until used further
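
The final concentrations implied by this reaction mix follow from the usual dilution relationship (stock concentration × volume added ÷ total volume). The Python sketch below works through that arithmetic using the volumes and stock concentrations listed above; the dictionary keys and function name are hypothetical, and whether the 10 mM dNTP stock refers to each nucleotide or to the total is not stated above, so the result is reported simply as the diluted stock concentration.

```python
# Component volumes (uL) and stock concentrations taken from the listed mix.
REACTION_MIX = {
    "buffer (X)": (10.0, 10.0),           # (stock concentration, volume in uL)
    "DMSO (%)": (100.0, 5.0),
    "dNTP mix (mM)": (10.0, 4.0),
    "forward primer (uM)": (50.0, 2.0),
    "reverse primer (uM)": (50.0, 2.0),
    "genomic DNA (ng/uL)": (50.0, 2.0),
    "Pfu enzyme (U/uL)": (2.5, 2.0),
}
WATER_UL = 73.0


def total_volume(mix=REACTION_MIX, water_ul=WATER_UL):
    """Sum of all component volumes plus water, in microliters."""
    return sum(volume for _, volume in mix.values()) + water_ul


def final_concentrations(mix=REACTION_MIX, water_ul=WATER_UL):
    """Final concentration of each component after dilution into the mix."""
    total = total_volume(mix, water_ul)
    return {name: stock * volume / total
            for name, (stock, volume) in mix.items()}


if __name__ == "__main__":
    print(total_volume())
    for name, conc in final_concentrations().items():
        print(name, conc)
```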

[0220] The PCR product was separated on a 1.2% TAE-agarose gel. An about 500 bp fragment was excised and gel purified. The isolated DNA was restricted with XbaI and BamHI, and the resulting digested DNA was column purified using a Qiagen gel isolation kit. Three μg of pBBR1MCS2 plasmid DNA was digested with BamHI and XbaI. The digestion was inactivated at 80° C. for 20 minutes. The digested vector was then dephosphorylated with shrimp alkaline phosphatase and gel purified on a 1% TAE-agarose gel. Eighty-six ng of the prepared pBBR1MCS2 vector was ligated with 60 ng of the digested glnBP PCR product using T4 DNA ligase at 14° C. for 14-16 hours. One μL of ligation reaction was used to electroporate 40 μL of E. coli Electromax™ DH10B™ cells. Electroporated cells were plated on LB media containing 25 μg/mL of kanamycin and 50 μg/mL of Xgal (LBKX). Eight individual, white colonies were selected, and their plasmid DNA was isolated using a QIAprep Spin Miniprep Kit. Plasmid DNA isolated from each colony was digested in separate reaction mixtures with PstI and a combination of EcoRI/XbaI. All eight clones had a restriction pattern that indicated the presence of the insert. The sequence of three clones was verified.
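
The insert-to-vector molar ratio implied by the ligation amounts given above can be estimated from the masses and fragment lengths, since for double-stranded DNA the number of moles is proportional to mass divided by length. The Python sketch below assumes a vector length of 5,144 bp (the undigested pBBR1MCS2 size) and an insert length of 500 bp (the approximate size of the excised fragment); both are approximations, and the function name is hypothetical.

```python
def insert_to_vector_molar_ratio(insert_ng, insert_bp, vector_ng, vector_bp):
    """Approximate molar ratio of insert to vector in a ligation."""
    if min(insert_ng, insert_bp, vector_ng, vector_bp) <= 0:
        raise ValueError("all quantities must be positive")
    # Moles are proportional to mass / length; the constant factor cancels.
    return (insert_ng / insert_bp) / (vector_ng / vector_bp)


if __name__ == "__main__":
    ratio = insert_to_vector_molar_ratio(insert_ng=60, insert_bp=500,
                                         vector_ng=86, vector_bp=5144)
    print(round(ratio, 1))
```

Under these assumptions the ligation used roughly a sevenfold molar excess of insert over vector.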

[0221] pMCS2tetP

[0222] The vector designated pMCS2tetP, which contains a tet promoter, was constructed by cloning the promoter for the tetracycline resistance determinants from transposon Tn1721 (Waters et al., Nucleic Acids Research, 11(17):6089-6105 (1983)) into the pBBR1MCS2 vector. The tetA gene promoter (tetP) was amplified using plasmid pRK415 as template. The following primers were designed to introduce an XbaI restriction site at the beginning of the amplified fragment and a BamHI site at the end of the amplified fragment.

TETXBAF 5′-TTATCTAGAACCGTCTACGCCGACCTCGTTCAAC-3′ (SEQ ID NO:142)
TETBAMR 5′-TTAGGATCCCCTCCGCTGGTCCGATTGAAC-3′ (SEQ ID NO:143)

[0223] The PCR mix contained the following: 1×Native Plus Pfu buffer, 20 ng pRK415 plasmid DNA, 0.2 μM of each primer, 0.2 mM of each dNTP, 5% DMSO (v/v), and 10 units of native Pfu DNA polymerase in a final volume of 200 μL. The PCR reaction was performed in a Perkin Elmer Geneamp PCR System 2400 under the following conditions: an initial denaturation at 94° C. for 1 minute; 8 cycles of 94° C. for 30 seconds, 60° C. for 45 seconds, and 72° C. for 45 seconds; 24 cycles of 94° C. for 30 seconds, 66° C. for 45 seconds, and 72° C. for 45 seconds; and a final extension for 7 minutes at 72° C. The amplification product was then separated by gel electrophoresis using a 2% TAE-agarose gel. A 160 bp fragment was excised from the gel and purified. The purified fragment was digested simultaneously with XbaI and BamHI restriction enzymes, and purified with a QIAquick PCR Purification Kit. Three μg of pBBR1MCS2 plasmid DNA was digested with BamHI and XbaI, and the digest was inactivated at 80° C. for 20 minutes. The digested vector was then dephosphorylated with shrimp alkaline phosphatase and gel purified on a 1% TAE-agarose gel.

[0224] 100 ng of the prepared pBBR1MCS2 vector was ligated with 36 ng of the digested tetP PCR product using T4 DNA ligase at 16° C. for 16 hours. One μL of ligation reaction was used to electroporate 40 μL of E. coli Electromax™ DH5α™ cells. Electroporated cells were plated on LB media containing 25 μg/mL of kanamycin and 50 μg/mL of Xgal (LBKX). Individual, white colonies were resuspended in about 25 μL of 10 mM Tris, and 2 μL of the resuspension was plated on LBKX. The remnant resuspension was heated for 10 minutes at 95° C. to break open the bacterial cells. Two μL of the heated cells was used in a 25 μL PCR reaction using the following primers homologous to the vector and flanking the cloning site:

MCS2FS 5′-AGGCGATTAAGTTGGGTAAC-3′ (SEQ ID NO:144)
MCS2RS 5′-GACCATGATTACGCCAAG-3′ (SEQ ID NO:145)

[0225] The PCR mix contained the following: 1×Taq PCR buffer, 0.2 μM each primer, 0.2 mM each dNTP, and 1 unit of Taq DNA polymerase per reaction. The PCR reaction was performed in a MJ Research PTC100 under the following conditions: an initial denaturation at 94° C. for 2 minutes; 32 cycles of 94° C. for 30 seconds, 55° C. for 45 seconds, and 72° C. for 1 minute; and a final extension for 7 minutes at 72° C. All colonies showed a single insertion event. Plasmid DNA was isolated from cultures of two individual colonies and sequenced to confirm the DNA sequence of the tet promoter in the construct.

[0226] pMCS2rrnBP/Stdxs

[0227] The nucleic acid encoding a S. trueperi polypeptide having DXS activity was cloned in the pMCS2rrnBP vector as follows. The S. trueperi dxs gene was amplified by PCR using primers homologous to sequence upstream and downstream of the gene. These primers, STDXSMCSF and STDXSMCSR, were designed to introduce a ClaI restriction site at the beginning of the amplified fragment and a KpnI site at the end of the amplified fragment.

STDXSMCSF 5′-GATAATCGATGTGTGACTGACCTGTCCAAC-3′ (SEQ ID NO:146)
STDXSMCSR 5′-CTTAGGTACCATGTTGGAGATTCAAGGTGG-3′ (SEQ ID NO:147)
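
The presence of the intended restriction sites in these two primers can be confirmed by a simple string search. The Python sketch below uses the well-known recognition sequences for ClaI (ATCGAT) and KpnI (GGTACC) and the primer sequences given above, written without the line-break hyphens; the function name is hypothetical, and positions are reported as 0-based offsets from the 5′ end.

```python
# Recognition sequences for the two enzymes named in the text.
SITES = {"ClaI": "ATCGAT", "KpnI": "GGTACC"}

# Primer sequences from the text, joined across the printed line breaks.
STDXSMCSF = "GATAATCGATGTGTGACTGACCTGTCCAAC"
STDXSMCSR = "CTTAGGTACCATGTTGGAGATTCAAGGTGG"


def find_sites(primer, sites=SITES):
    """Map each enzyme whose site occurs in the primer to its first offset."""
    seq = primer.upper()
    return {name: seq.find(site)
            for name, site in sites.items() if site in seq}


if __name__ == "__main__":
    print("STDXSMCSF", find_sites(STDXSMCSF))
    print("STDXSMCSR", find_sites(STDXSMCSR))
```

Each site sits four bases in from the 5′ end of its primer, leaving a short overhang of flanking bases for the enzyme to bind.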

[0228] The PCR mix contained the following: 1×Native Plus Pfu buffer, 200 ng S. trueperi genomic DNA, 0.2 μM of each primer, 0.2 mM each dNTP, 5% DMSO (v/v), and 10 units of native Pfu DNA polymerase (Stratagene, La Jolla, Calif.) in a final volume of 200 μL. The PCR reaction was performed in a Perkin Elmer Geneamp PCR System 2400 under the following conditions: an initial denaturation at 94° C. for 1 minute; 8 cycles of 94° C. for 30 seconds, 54° C. for 45 seconds, and 72° C. for 3.5 minutes; 27 cycles of 94° C. for 30 seconds, 60° C. for 45 seconds, and 72° C. for 3.5 minutes; and a final extension for 7 minutes at 72° C. The amplification product was then separated by gel electrophoresis using a 1% TAE-agarose gel. A 2.2 Kb fragment was excised from the gel and purified. The purified fragment was digested with ClaI restriction enzyme, purified with a QIAquick PCR Purification Kit, digested with KpnI restriction enzyme, purified again with a QIAquick PCR Purification Kit, and quantified on a minigel.

[0229] Three μg of the pMCS2rrnBP vector was digested with the restriction enzyme ClaI, gel purified on a 1% TAE-agarose gel, digested with KpnI, purified with a QIAquick PCR Purification Kit, dephosphorylated with shrimp alkaline phosphatase, and purified again with a QIAquick PCR Purification Kit. 120 ng of the digested PCR product containing the S. trueperi dxs gene and 50 ng of the prepared pMCS2rrnBP vector were ligated using T4 DNA ligase at 16° C. for 16 hours. One μL of the ligation reaction was used to electroporate 40 μL of E. coli Electromax™ DH10B™ cells. The electroporated cells were plated onto media. Plasmid DNA was isolated from cultures of individual colonies and evaluated for the presence of the desired insert by restriction enzyme analysis with HindIII and SacI enzymes. The sequence of the Stdxs insert was confirmed by DNA sequencing. The resulting plasmid containing the Stdxs sequence under the control of the rrnB promoter was designated pMCS2rrnBP/Stdxs.

[0230] Purified pMCS2rrnBP/Stdxs plasmid DNA derived from a colony having the correct sequence was then electroporated into electrocompetent cells of R. sphaeroides strain 35053. Plasmid DNA was isolated from cultures of individual R. sphaeroides colonies. Restriction patterns of plasmid preparations from R. sphaeroides are difficult to analyze due to the presence of multiple native plasmids in this species. To check the plasmid integrity in R. sphaeroides, one μL of the plasmid preparation from a transformed R. sphaeroides colony was used to re-transform E. coli Electromax™ DH10B™ cells by electroporation. Electroporated cells were plated on LBK media. Plasmid DNA was isolated from cultures of individual colonies and evaluated using SacI and HindIII restriction digests.

[0231] pMCS2rrnBP/Stdxs2

[0232] A second pMCS2rrnBP plasmid containing the nucleic acid encoding a S. trueperi polypeptide having DXS activity was constructed. This construct was made using the following forward primer designed to introduce the ribosomal binding site (rbs) from the R. sphaeroides dxs I gene along with a ClaI restriction site.

SXSCLAF2 5′-ACTATCGATGAAGGAAGAGCATGGCTGACCTACCCAAGAC-3′ (SEQ ID NO:146)

[0233] S. trueperi genomic DNA was used as template in a PCR mixture using the primers SXSCLAF2 and STDXSMCSR. The PCR program and reaction mixture used were identical to those described for the pMCS2rrnBP/Stdxs construct. The PCR product was gel purified, digested with ClaI, purified with a QIAquick PCR Purification Kit, digested with restriction enzyme KpnI, and purified again with a QIAquick PCR Purification Kit. 150 ng of digested PCR product was ligated into 50 ng of the prepared pMCS2rrnBP vector using T4 DNA ligase at 16° C. for 16 hours. One μL of the ligation reaction was transformed into E. coli Electromax™ DH10B™ cells, and the electroporated cells were plated onto LBK plates. Plasmid DNA was isolated from cultures of individual colonies and evaluated for the presence of the desired insert by restriction enzyme analysis with HindIII and SacI enzymes. The sequence of the dxs insert was confirmed by DNA sequencing. The resulting plasmid containing the Stdxs sequence under the control of the rrnB promoter and having an R. sphaeroides ribosomal binding site was designated pMCS2rrnBP/Stdxs2.

[0234] A confirmed construct was electroporated into R. sphaeroides strain 35053, and the electroporated cells were plated onto LBK media. Individual colonies were resuspended in about 25 μL of 10 mM Tris, and 2 μL of the resuspension was plated on LBK media. The remnant resuspension was heated for 10 minutes at 95° C. to break open the bacterial cells, and 2 μL of the heated cells was used in a 25 μL PCR reaction using the SXSCLAF2 and STDXSMCSR primers. The PCR mix contained the following: 1×Taq PCR buffer, 0.2 μM each primer, 0.2 mM each dNTP, 5% DMSO (v/v), and 1 unit of Taq DNA polymerase (Roche) per reaction. The PCR reaction was performed in a MJ Research PTC100 under the following conditions: an initial denaturation at 94° C. for 2 minutes; 8 cycles of 94° C. for 30 seconds, 54° C. for 1 minute, and 72° C. for 3.5 minutes; 24 cycles of 94° C. for 30 seconds, 60° C. for 1 minute, and 72° C. for 3.5 minutes; and a final extension for 7 minutes at 72° C.

[0235] pMCS2rrnBP/Rsdds

[0236] The nucleic acid encoding a R. sphaeroides polypeptide having DDS activity was cloned in the pMCS2rrnBP vector as follows. The R. sphaeroides dds gene was PCR amplified using the following primer pair:

RDS18F 5′-ACTAGAATTCCGCAACAGTTCCTTCATGTC-3′ (SEQ ID NO:147)
RSDDSMCSR 5′-CTAGATCGATACTTGCGGTCGGACTGATAG-3′ (SEQ ID NO:148)

[0237] The forward primer was located upstream of the start codon and introduced an EcoRI restriction site, while the reverse primer was located downstream of the stop codon and introduced a ClaI restriction site. Since the forward primer was located upstream, the R. sphaeroides dds maintained its native ribosomal binding site. The following reaction mix and PCR program were used to amplify the R. sphaeroides dds gene.

Reaction Mix
Pfu 10X buffer: 10 μL
DMSO: 5 μL
dNTP mix (10 mM): 4 μL
RDS18F (50 μM): 2 μL
RSDDSMCSR (50 μM): 2 μL
Genomic DNA (50 ng/μL): 2 μL
Pfu enzyme (2.5 U/μL): 1 μL
DI water: 74 μL
Total: 100 μL

PCR program
94° C. for 2 minutes
8 cycles of: 94° C. for 30 seconds, 55° C. for 45 seconds, 72° C. for 3 minutes
21 cycles of: 94° C. for 30 seconds, 61° C. for 45 seconds, 72° C. for 3 minutes
72° C. for 7 minutes
4° C. until used further

[0238] The PCR product was separated on a 1% TAE-agarose gel, and a fragment of about 1.8 Kb was excised and gel purified. The isolated DNA was restricted with EcoRI and ClaI, and was column purified using a Qiagen gel isolation kit. Three μg of pMCS2rrnBP vector DNA was digested with EcoRI, and the linear DNA was gel isolated using a Qiagen gel isolation kit. The vector was further digested with ClaI, and the DNA was column purified. The double-digested vector was then dephosphorylated with shrimp alkaline phosphatase and purified using a QIAquick PCR Purification Kit. The EcoRI/ClaI-digested R. sphaeroides dds PCR product was ligated into the prepared vector using T4 DNA ligase for 14-16 hours at 16° C. One μL of the ligation reaction was transformed into E. coli Electromax™ DH10B™ cells, which were then plated on LBK (25 μg/mL) media. Individual colonies were resuspended in about 25 μL of DI water, and 2 μL of the resuspension was plated on LBK. The remnant resuspension was heated for 10 minutes at 95° C. to break open the bacterial cells, and 2 μL of the heated cells was used in a 25 μL PCR reaction using the RDS18F and RSDDSMCSR primers. The PCR mix contained the following: 1×Taq PCR buffer, 0.2 μM each primer, 0.2 mM each dNTP, 5% DMSO (v/v), and 1 unit of Taq DNA polymerase per reaction. The PCR reaction was performed under the following conditions: an initial denaturation at 94° C. for 2 minutes; 6 cycles of 94° C. for 30 seconds, 55° C. for 45 seconds, and 72° C. for 3 minutes; 25 cycles of 94° C. for 30 seconds, 61° C. for 45 seconds, and 72° C. for 3 minutes; and a final extension for 7 minutes at 72° C. The resulting plasmid containing the Rsdds sequence under the control of the rrnB promoter was designated pMCS2rrnBP/Rsdds.
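The two-stage colony-screening program in paragraph [0238] can be written down as data, which makes the cycle count and the programmed hold time easy to total. The sketch below is a minimal illustration; the class and function names are hypothetical, and ramp times between temperatures are ignored because the text does not give them.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Step:
    temp_c: float
    seconds: int


@dataclass(frozen=True)
class Stage:
    cycles: int
    steps: tuple


# Screening program from paragraph [0238].
SCREEN_PROGRAM = (
    Stage(1, (Step(94, 120),)),                                # initial denaturation
    Stage(6, (Step(94, 30), Step(55, 45), Step(72, 180))),     # lower annealing temperature
    Stage(25, (Step(94, 30), Step(61, 45), Step(72, 180))),    # higher annealing temperature
    Stage(1, (Step(72, 420),)),                                # final extension
)


def amplification_cycles(program):
    """Count cycles in stages that have denature/anneal/extend steps."""
    return sum(stage.cycles for stage in program if len(stage.steps) > 1)


def hold_seconds(program):
    """Total programmed hold time, excluding ramping (not specified in the text)."""
    return sum(
        stage.cycles * sum(step.seconds for step in stage.steps)
        for stage in program
    )


if __name__ == "__main__":
    print("amplification cycles:", amplification_cycles(SCREEN_PROGRAM))
    print("programmed hold time (min):", hold_seconds(SCREEN_PROGRAM) / 60)
```

The same structure can hold any of the other programs in this section by changing the cycle counts, temperatures, and times.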

[0239] The pMCS2rrnBP/Rsdds plasmid was electroporated into E. coli strain S17-1. This strain contains a chromosomal copy of the trans-acting elements that mobilize oriT-containing plasmids during conjugation with a second bacterial strain. It also carries a gene conferring resistance to the antibiotics streptomycin and spectinomycin.

[0240] Using the S17-1 strain, the pMCS2rrnBP/Rsdds plasmid was transferred to R. sphaeroides 35053 by conjugation. Individual colonies were purified by restreaking on LBK plates. Single colonies were screened by PCR using the RDS18F and RSDDSMCSR primers to confirm the presence of the insert as described above.

[0241] pMCS2rrnBP/Stdds

[0242] The nucleic acid encoding an S. trueperi polypeptide having DDS activity was cloned in the pMCS2rrnBP vector as follows. The S. trueperi dds gene was PCR amplified using the following primer pair:
STDDSMCSF 5′-GTCGCTCGAGATCAGATAATCGTCGCTCAA-3′ (SEQ ID NO:149)
STDDSMCSR 5′-ATATGGTACCGACATGGACGAGGAAGACGC-3′ (SEQ ID NO:150)

[0243] The forward primer was located upstream of the start codon and introduced an XhoI restriction site, while the reverse primer was located downstream of the stop codon and introduced a KpnI restriction site. Since the forward primer was located upstream, the S. trueperi dds fragment maintained its native ribosomal binding site. The following reaction mix and PCR program were used to amplify the S. trueperi dds gene.

Reaction Mix
Pfu 10X buffer            10 μL
DMSO                       5 μL
dNTP mix (10 mM)           4 μL
SHDDSMCSF (50 μM)          2 μL
SHDDSMCSR (50 μM)          2 μL
Genomic DNA (50 ng/μL)     2 μL
Pfu enzyme (2.5 U/μL)      1 μL
DI water                  74 μL
Total:                   100 μL

Program
94° C.  2 minutes
8 cycles of:
  94° C.  30 seconds
  55° C.  45 seconds
  72° C.  3 minutes
21 cycles of:
  94° C.  30 seconds
  61° C.  45 seconds
  72° C.  3 minutes
72° C.  7 minutes
4° C.  Until used further
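The restriction sites that the rrnBP cloning primers are said to introduce can be located directly in the primer sequences. The sketch below searches each primer of SEQ ID NOs:147-150 for the standard recognition sequence of the named enzyme (EcoRI GAATTC, ClaI ATCGAT, XhoI CTCGAG, KpnI GGTACC). The helper name is hypothetical and the code is illustrative only.

```python
# Standard recognition sequences of the enzymes named in the text.
RECOGNITION_SITES = {
    "EcoRI": "GAATTC",
    "ClaI": "ATCGAT",
    "XhoI": "CTCGAG",
    "KpnI": "GGTACC",
}

# Primer name -> (sequence 5'->3', enzyme whose site it is said to introduce).
PRIMERS = {
    "RDS18F": ("ACTAGAATTCCGCAACAGTTCCTTCATGTC", "EcoRI"),
    "RSDDSMCSR": ("CTAGATCGATACTTGCGGTCGGACTGATAG", "ClaI"),
    "STDDSMCSF": ("GTCGCTCGAGATCAGATAATCGTCGCTCAA", "XhoI"),
    "STDDSMCSR": ("ATATGGTACCGACATGGACGAGGAAGACGC", "KpnI"),
}


def site_position(primer, enzyme):
    """Return the 0-based index of the enzyme's site in the primer, or -1 if absent."""
    return primer.find(RECOGNITION_SITES[enzyme])


if __name__ == "__main__":
    for name, (sequence, enzyme) in PRIMERS.items():
        print(f"{name}: {enzyme} site at index {site_position(sequence, enzyme)}")
```

In all four primers the site begins after a four-nucleotide 5′ tail, and the remainder of the primer follows the site.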

[0244] The PCR product was separated on a 1% TAE-agarose gel, and a fragment of about 1.6 Kb was excised. The DNA was isolated using a Qiagen gel isolation kit. The isolated DNA was restricted with XhoI and KpnI, and was column purified using a Qiagen gel isolation kit. Two μg of pMCS2rrnBP vector DNA was digested with KpnI, and the linear DNA was gel isolated using a Qiagen gel isolation kit. The vector was further digested with XhoI, and the DNA was column purified. The double-digested vector was then dephosphorylated with shrimp alkaline phosphatase and column purified using a Qiagen gel purification kit. The XhoI/KpnI-digested S. trueperi dds PCR product was ligated into the prepared vector using T4 DNA ligase for 14-16 hours at 16° C. One μL of the ligation reaction was transformed into E. coli Electromax™ DH10B™ cells, which were then plated on LBK (25 μg/mL) media. Individual colonies were resuspended in about 25 μL of DI water, and 2 μL of the resuspension was plated on LBK. The remnant resuspension was heated for 10 minutes at 95° C. to break open the bacterial cells, and 2 μL of the heated cells was used in a 25 μL PCR reaction using the SHDDSMCSF and SHDDSMCSR primers. The PCR mix contained the following: 1×Taq PCR buffer, 0.2 μM each primer, 0.2 mM each dNTP, 5% DMSO (v/v), and 1 unit of Taq DNA polymerase per reaction. The PCR reaction was performed under the following conditions: an initial denaturation at 94° C. for 2 minutes; 6 cycles of 94° C. for 30 seconds, 55° C. for 45 seconds, and 72° C. for 3 minutes; 25 cycles of 94° C. for 30 seconds, 61° C. for 45 seconds, and 72° C. for 3 minutes; and a final extension for 7 minutes at 72° C. The resulting plasmid containing the Stdds sequence under the control of the rrnB promoter was designated pMCS2rrnBP/Stdds.

[0245] The pMCS2rrnBP/Stdds plasmid was electroporated into E. coli strain S17-1. Using the S17-1 strain, the pMCS2rrnBP/Stdds plasmid was transferred to R. sphaeroides 35053 by conjugation. Individual colonies were purified by restreaking on LBK plates. Single colonies were screened by PCR using the SHDDSMCSF and SHDDSMCSR primers to confirm the presence of the insert as described above.

[0246] pMCS2glnBP/Rsdds

[0247] The nucleic acid encoding a R. sphaeroides polypeptide having DDS activity was cloned in the pMCS2glnBP vector as follows. The R. sphaeroides dds gene was PCR amplified using the following primer pair:
RSDDSF 5′-TAGAGAATTCGAAGGAAGAGCATGGGATTGGACGAGGTTTC-3′ (SEQ ID NO:151)
RSDDSR 5′-TACTACTTGTATGTAGGTACCACTTGCGGTCGGACTGATAG-3′ (SEQ ID NO:152)

[0248] The forward primer introduced an EcoRI restriction site and a ribosomal binding site that was designed based on the R. sphaeroides dxs1 gene. The reverse primer introduced a KpnI restriction site. The following reaction mix and PCR program were used to amplify the R. sphaeroides dds gene.

Reaction Mix
Pfu 10X buffer            10 μL
DMSO                       5 μL
dNTP mix (10 mM)           3 μL
RSDDSF (100 μM)            1 μL
RSDDSR (100 μM)            1 μL
Genomic DNA (50 ng/μL)     2 μL
Pfu enzyme (2.5 U/μL)      2 μL
DI water                  76 μL
Total:                   100 μL

Program
94° C.  2 minutes
7 cycles of:
  94° C.  30 seconds
  55° C.  45 seconds
  72° C.  3 minutes
25 cycles of:
  94° C.  30 seconds
  62° C.  45 seconds
  72° C.  3 minutes
72° C.  7 minutes
4° C.  Until used further

[0249] The PCR product was separated on a 1% TAE-agarose gel, and a fragment about 1.6 Kb in size was excised. The excised DNA was isolated using a Qiagen gel isolation kit. The isolated DNA was restricted with EcoRI and KpnI and was column purified using a Qiagen gel isolation kit. Three μg of pMCS2glnBP vector DNA was digested with KpnI, and the linear DNA was gel isolated using a Qiagen gel isolation kit. The vector was further digested with EcoRI, and the DNA was column purified. The double-digested vector was then dephosphorylated with shrimp alkaline phosphatase and column purified using a Qiagen gel purification kit. The KpnI/EcoRI-digested R. sphaeroides dds PCR product with the R. sphaeroides dxs1 ribosomal binding site described above was ligated into the prepared vector using T4 DNA ligase for 14-16 hours at 16° C. One μL of the ligation reaction was transformed into E. coli Electromax™ DH10B™ cells, which were then plated on LBK (25 μg/mL) media. Individual colonies were resuspended in about 25 μL of DI water, and 2 μL of the resuspension was plated on LBK. The remnant resuspension was heated for 10 minutes at 95° C. to break open the bacterial cells, and 2 μL of the heated cells was used in a 25 μL PCR reaction using the glnBF and RSDDSR primers. The PCR mix contained the following: 1×Taq PCR buffer, 0.2 μM each primer, 0.2 mM each dNTP, 5% DMSO (v/v), and 1 unit of Taq DNA polymerase per reaction. The PCR reaction was performed under the following conditions: an initial denaturation at 94° C. for 2 minutes; 6 cycles of 94° C. for 30 seconds, 55° C. for 45 seconds, and 72° C. for 3 minutes; 25 cycles of 94° C. for 30 seconds, 62° C. for 45 seconds, and 72° C. for 3 minutes; and a final extension for 7 minutes at 72° C. A large scale plasmid preparation was done on a culture of a colony containing the Rsdds PCR product, and the glnBP/Rsdds region was sequenced to confirm the lack of nucleotide errors. The resulting plasmid containing the Rsdds sequence under the control of the glnB promoter was designated pMCS2glnBP/Rsdds.
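Confirming "the lack of nucleotide errors" amounts to comparing the sequenced region with the expected sequence. The sketch below shows the simplest form of that comparison for two equal-length, already aligned strings. It is an assumption-laden toy: the function name is hypothetical, the reference is only the 20 nucleotides of the RSDDSF primer that follow its EcoRI site, the "read with error" is invented for illustration, and real sequencing reads would need alignment and base-quality handling first.

```python
def find_mismatches(reference, read):
    """List (1-based position, reference base, read base) for every substitution.

    Assumes the two sequences are already aligned and of equal length.
    """
    if len(reference) != len(read):
        raise ValueError("sequences must be aligned to the same length")
    return [
        (index + 1, ref_base, read_base)
        for index, (ref_base, read_base) in enumerate(zip(reference.upper(), read.upper()))
        if ref_base != read_base
    ]


# Toy reference: the RBS and first codons carried by the RSDDSF primer.
REFERENCE = "GAAGGAAGAGCATGGGATTG"
CLEAN_READ = "GAAGGAAGAGCATGGGATTG"
READ_WITH_ERROR = "GAAGGAAGAGCATGGCATTG"  # invented G->C substitution at position 16

if __name__ == "__main__":
    print(find_mismatches(REFERENCE, CLEAN_READ))
    print(find_mismatches(REFERENCE, READ_WITH_ERROR))
```

An empty list corresponds to a clone that would be carried forward; any entry would flag a position to inspect.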

[0250] The pMCS2glnBP/Rsdds plasmid DNA was electroporated into electrocompetent R. sphaeroides strain 35053 cells as well as electrocompetent carotenoid-deficient mutant cells of 35053 (ATCC 35053/ΔcrtE). Individual colonies of both strains were screened by PCR using the glnBF and RSDDSR primers to confirm the presence of the insert as described above.

[0251] pMCS2glnBP/Stdds

[0252] The nucleic acid encoding an S. trueperi polypeptide having DDS activity was cloned in the pMCS2glnBP vector as follows. The S. trueperi dds gene was PCR amplified using the following primer pair:
SHDDSECOVF 5′-GCGTGATATCGAAGGAAGAGCATGAGCGCAACCGTCCACCG-3′ (SEQ ID NO:153)
SHDDSKPNR 5′-ACTGCTAGGGTCCGAGGTACCGACATGGACGAGGAAGACGC-3′ (SEQ ID NO:154)

[0253] The forward primer introduced an EcoRV restriction site and a ribosomal binding site that was designed based on the R. sphaeroides dxs1 gene. The reverse primer introduced a KpnI restriction site. The following reaction mix and PCR program were used to amplify the S. trueperi dds gene.

Reaction Mix
Pfu 10X buffer            10 μL
DMSO                       5 μL
dNTP mix (10 mM)           3 μL
SHDDSECOVF (100 μM)        1 μL
SHDDSKPNR (100 μM)         1 μL
Genomic DNA (50 ng/μL)     2 μL
Pfu enzyme (2.5 U/μL)      2 μL
DI water                  76 μL
Total:                   100 μL

Program
94° C.  2 minutes
7 cycles of:
  94° C.  30 seconds
  58° C.  45 seconds
  72° C.  3 minutes
25 cycles of:
  94° C.  30 seconds
  65° C.  45 seconds
  72° C.  3 minutes
72° C.  7 minutes
4° C.  Until used further
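The glnBP forward primers carry, in order, a short 5′ tail, the restriction site, the dxs1-based ribosomal binding region, and the start of the coding sequence. The sketch below splits RSDDSF (SEQ ID NO:151) and SHDDSECOVF (SEQ ID NO:153) into those parts. It is illustrative only: the function name is hypothetical, the hyphens printed inside the sequences are assumed to be line-wrap marks and are omitted, and the region between the site and the first ATG is labeled as the ribosomal binding region on the basis of the description above.

```python
def split_forward_primer(primer, site):
    """Split a forward primer into tail, restriction site, RBS region, and coding start."""
    site_start = primer.index(site)
    site_end = site_start + len(site)
    start_codon = primer.index("ATG", site_end)  # first ATG downstream of the site
    return {
        "tail": primer[:site_start],
        "site": primer[site_start:site_end],
        "rbs_region": primer[site_end:start_codon],
        "coding_start": primer[start_codon:],
    }


RSDDSF = "TAGAGAATTCGAAGGAAGAGCATGGGATTGGACGAGGTTTC"      # EcoRI site GAATTC
SHDDSECOVF = "GCGTGATATCGAAGGAAGAGCATGAGCGCAACCGTCCACCG"  # EcoRV site GATATC

if __name__ == "__main__":
    for name, primer, site in (("RSDDSF", RSDDSF, "GAATTC"), ("SHDDSECOVF", SHDDSECOVF, "GATATC")):
        print(name, split_forward_primer(primer, site))
```

Both primers yield the same region between the restriction site and the start codon, consistent with a single ribosomal binding site design being reused across genes.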

[0254] The PCR product was separated on a 1% TAE-agarose gel, and a fragment about 1.2 Kb in size was excised. The excised DNA was isolated using a Qiagen gel isolation kit. The isolated DNA was restricted with EcoRV and KpnI and was column purified using a Qiagen gel isolation kit. Three μg of pMCS2glnBP vector DNA was digested with KpnI, and the linear DNA was gel isolated using a Qiagen gel isolation kit. The vector was further digested with EcoRV, and the DNA was column purified. The double-digested vector was then dephosphorylated with shrimp alkaline phosphatase and column purified using a Qiagen gel purification kit. The KpnI/EcoRV-digested S. trueperi dds PCR product with the R. sphaeroides dxs1 ribosomal binding site was ligated into the prepared vector using T4 DNA ligase for 14-16 hours at 16° C. One μL of the ligation reaction was transformed into E. coli Electromax™ DH10B™ cells, which were plated on LBK (25 μg/mL) media. Individual colonies were resuspended in about 25 μL of DI water, and 2 μL of the resuspension was plated on LBK. The remnant resuspension was heated for 10 minutes at 95° C. to break open the bacterial cells, and 2 μL of the heated cells was used in a 25 μL PCR reaction using the glnBF and RSDDSR primers. The PCR mix contained the following: 1×Taq PCR buffer, 0.2 μM each primer, 0.2 mM each dNTP, 5% DMSO (v/v), and 1 unit of Taq DNA polymerase per reaction. The PCR reaction was performed under the following conditions: an initial denaturation at 94° C. for 2 minutes; 6 cycles of 94° C. for 30 seconds, 58° C. for 45 seconds, and 72° C. for 3 minutes; 25 cycles of 94° C. for 30 seconds, 65° C. for 45 seconds, and 72° C. for 3 minutes; and a final extension for 7 minutes at 72° C. A large scale plasmid preparation was done on a culture of a colony containing the Stdds PCR product, and the glnBP/Stdds region was sequenced to confirm the lack of nucleotide errors. The resulting plasmid containing the Stdds sequence under the control of the glnB promoter was designated pMCS2glnBP/Stdds.

[0255] The pMCS2glnBP/Stdds plasmid DNA was electroporated into electrocompetent cells of R. sphaeroides strain 35053 and a carotenoid-deficient mutant of 35053 (ATCC 35053/ΔcrtE). Individual colonies of both strains were screened by PCR using the glnBF and SHDDSKPNR primers to confirm the presence of the insert as described above.

[0256] pMCS2tetP/Stdxs

[0257] The nucleic acid encoding an S. trueperi polypeptide having DXS activity was cloned in the pMCS2tetP vector as follows. The pMCS2tetP plasmid DNA was digested with the restriction enzyme KpnI, cleaned with a QIAquick PCR Purification Kit, and digested with the restriction enzyme ClaI. The enzyme reactions were inactivated by heating at 65° C. for 20 minutes. The digested vector DNA was then dephosphorylated with shrimp alkaline phosphatase and gel purified on a 1% TAE-agarose gel. The KpnI/ClaI-digested S. trueperi dxs PCR product described above with the R. sphaeroides dxs1 ribosomal binding site was ligated into the prepared vector using T4 DNA ligase for 16 hours at 16° C. One μL of the ligation reaction was transformed into E. coli Electromax™ DH5α™ cells, which were plated on LBK media. Individual colonies were resuspended in about 25 μL of 10 mM Tris, and 2 μL of the resuspension was plated on LBK. The remnant resuspension was heated for 10 minutes at 95° C. to break open the bacterial cells, and 2 μL of the heated cells was used in a 25 μL PCR reaction using the SXSCLAF2 and SHDXSMCSR primers. The PCR mix contained the following: 1×Taq PCR buffer, 0.2 μM each primer, 0.2 mM each dNTP, 5% DMSO (v/v), and 1 unit of Taq DNA polymerase per reaction. The PCR reaction was performed in a MJ Research PTC100 under the following conditions: an initial denaturation at 94° C. for 2 minutes; 8 cycles of 94° C. for 30 seconds, 54° C. for 1 minute, and 72° C. for 3.5 minutes; 24 cycles of 94° C. for 30 seconds, 60° C. for 1 minute, and 72° C. for 3.5 minutes; and a final extension for 7 minutes at 72° C. A large scale plasmid preparation was done on a culture of a colony containing the S. trueperi dxs PCR product, and the tetP/Stdxs region was sequenced to confirm the lack of nucleotide errors. The resulting plasmid containing the Stdxs sequence under the control of the tet promoter was designated pMCS2tetP/Stdxs.

[0258] Plasmid DNA (pMCS2tetP/Stdxs) was electroporated into electrocompetent cells of R. sphaeroides strain 35053 and a carotenoid-deficient mutant of 35053 (ATCC 35053/ΔcrtE). Individual colonies of both strains, along with an E. coli control, were screened by PCR using the TETXBAF and STDXSMCSR primers to confirm the presence of the insert as described above.

[0259] pMCS2tetP/Rsdds

[0260] The nucleic acid encoding a R. sphaeroides polypeptide having DDS activity was cloned in the pMCS2tetP vector as follows. Three μg of plasmid DNA of the pMCS2tetP vector was digested with the restriction enzyme KpnI. The digested DNA was cleaned with a QIAquick PCR Purification Kit and digested with the restriction enzyme EcoRI, after which the enzyme was inactivated by heating at 65° C. for 20 minutes. The digested vector DNA was then dephosphorylated with shrimp alkaline phosphatase and gel purified. Sixty ng of vector DNA was ligated with 120 ng of the KpnI/EcoRI-digested R. sphaeroides dds PCR product described above using T4 DNA ligase at 16° C. for 16 hours. One μL of the ligation reaction was transformed into E. coli Electromax™ DH5α™ cells, which were then plated on LBK media. Individual colonies were resuspended in about 25 μL of 10 mM Tris, and 2 μL of the resuspension was plated on LBK. The remnant resuspension was heated for 10 minutes at 95° C. to break open the bacterial cells, and 2 μL of the heated cells was used in a 25 μL PCR reaction using the TETXBAF and RSDDSMCSR primers. The PCR mix contained the following: 1×Taq PCR buffer, 0.2 μM each primer, 0.2 mM each dNTP, 5% DMSO (v/v), and 1 unit of Taq DNA polymerase per reaction. The PCR reaction was performed in a MJ Research PTC100 under the following conditions: an initial denaturation at 94° C. for 2 minutes; 8 cycles of 94° C. for 30 seconds, 55° C. for 1 minute, and 72° C. for 3 minutes; 24 cycles of 94° C. for 30 seconds, 64° C. for 1 minute, and 72° C. for 3 minutes; and a final extension for 7 minutes at 72° C. Plasmid DNA was isolated from a colony having the desired insert, and the tetP/Rsdds region was sequenced to confirm the lack of nucleotide errors from PCR. The resulting plasmid containing the Rsdds sequence under the control of the tet promoter was designated pMCS2tetP/Rsdds.
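The ligation above is specified by mass (60 ng of vector, 120 ng of insert). Converting those masses to a molar insert-to-vector ratio requires the fragment lengths. The sketch below shows the conversion; the function name is hypothetical, the insert length uses the approximately 1.6 Kb dds fragment described earlier, and the vector length of 5,000 bp is a placeholder assumption because the size of pMCS2tetP is not given in this section.

```python
def insert_to_vector_molar_ratio(insert_ng, insert_bp, vector_ng, vector_bp):
    """Molar ratio of insert to vector for double-stranded DNA.

    Moles are proportional to mass / length, so the per-base-pair mass cancels.
    """
    if min(insert_ng, insert_bp, vector_ng, vector_bp) <= 0:
        raise ValueError("masses and lengths must be positive")
    return (insert_ng / insert_bp) / (vector_ng / vector_bp)


INSERT_NG = 120.0    # from paragraph [0260]
VECTOR_NG = 60.0     # from paragraph [0260]
INSERT_BP = 1600     # "about 1.6 Kb" dds PCR fragment
VECTOR_BP = 5000     # ASSUMPTION: placeholder; the vector size is not stated

if __name__ == "__main__":
    ratio = insert_to_vector_molar_ratio(INSERT_NG, INSERT_BP, VECTOR_NG, VECTOR_BP)
    print(f"insert:vector molar ratio = {ratio:.2f}:1")
```

With the placeholder vector length the insert is in several-fold molar excess; the actual ratio scales linearly with the true vector size.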

[0261] Plasmid DNA (pMCS2tetP/Rsdds) was electroporated into electrocompetent cells of R. sphaeroides strain 35053 and the ATCC 35053/ΔcrtE strain. Individual colonies of both strains, along with an E. coli control, were screened by PCR using the TETXBAF and RSDDSMCSR primers to confirm the presence of the insert as described above.

[0262] pMCS2tetP/Stdds

[0263] The nucleic acid encoding an S. trueperi polypeptide having DDS activity was cloned in the pMCS2tetP vector as follows. Three μg of pMCS2tetP plasmid DNA was digested with the restriction enzyme KpnI. The digested DNA was gel purified and digested with the restriction enzyme EcoRV. The enzyme was then inactivated by heating at 80° C. for 20 minutes, and the DNA was dephosphorylated with shrimp alkaline phosphatase. The dephosphorylated DNA was purified using a QIAquick PCR purification kit. Fifty ng of digested vector DNA was ligated with 150 ng of the KpnI/EcoRV-digested S. trueperi dds PCR product described above using T4 DNA ligase at 16° C. for 16 hours. One μL of the ligation reaction was transformed into E. coli Electromax™ DH10B™ cells, which were then plated on LBK media. Individual colonies were resuspended in about 25 μL of 10 mM Tris, and 2 μL of the resuspension was plated on LBK. The remnant resuspension was heated for 10 minutes at 95° C. to break open the bacterial cells, and 2 μL of the heated cells was used in a 25 μL PCR reaction using the TETXBAF and STDDSMCSR primers. The PCR mix contained the following: 1×Taq PCR buffer, 0.2 μM each primer, 0.2 mM each dNTP, 5% DMSO (v/v), and 1 unit of Taq DNA polymerase per reaction. The PCR reaction was performed in a MJ Research PTC100 under the following conditions: an initial denaturation at 94° C. for 2 minutes; 8 cycles of 94° C. for 30 seconds, 55° C. for 1 minute, and 72° C. for 3 minutes; 24 cycles of 94° C. for 30 seconds, 64° C. for 1 minute, and 72° C. for 3 minutes; and a final extension for 7 minutes at 72° C. Plasmid DNA was isolated from a colony having the desired insert and was sequenced in the tetP/Stdds region to confirm the DNA sequence of the insert. The resulting plasmid containing the Stdds sequence under the control of the tet promoter was designated pMCS2tetP/Stdds.
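The colony-screening mix is given as final concentrations in a 25 μL reaction. The sketch below converts those targets into pipetting volumes using V = C_final × V_total / C_stock. Every stock concentration in it is an assumption chosen for illustration (10× buffer, 10 μM primers, a dNTP mix at 10 mM each, neat DMSO, Taq at 5 U/μL), and the names are hypothetical; the source states only the final values and the 2 μL of heated cells.

```python
REACTION_UL = 25.0
HEATED_CELLS_UL = 2.0  # template volume from the text

# name -> (final concentration from the text, ASSUMED stock concentration), same unit per pair
DILUTED = {
    "Taq PCR buffer (X)": (1.0, 10.0),
    "forward primer (uM)": (0.2, 10.0),
    "reverse primer (uM)": (0.2, 10.0),
    "dNTP mix (mM each)": (0.2, 10.0),
    "DMSO (% v/v)": (5.0, 100.0),
}
TAQ_UNITS = 1.0           # units per reaction, from the text
TAQ_STOCK_U_PER_UL = 5.0  # ASSUMPTION: enzyme stock activity


def colony_pcr_volumes():
    """Return the volume (uL) of each component, with water making up the remainder."""
    volumes = {
        name: final * REACTION_UL / stock
        for name, (final, stock) in DILUTED.items()
    }
    volumes["Taq DNA polymerase"] = TAQ_UNITS / TAQ_STOCK_U_PER_UL
    volumes["heated cells"] = HEATED_CELLS_UL
    volumes["water"] = REACTION_UL - sum(volumes.values())
    return volumes


if __name__ == "__main__":
    for name, volume in colony_pcr_volumes().items():
        print(f"{name}: {volume:.2f} uL")
```

Because many colonies are screened at once, such per-reaction volumes would typically be multiplied into a master mix, with only the heated cells added individually.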

[0264] Plasmid DNA (pMCS2tetP/Stdds) was electroporated into electrocompetent cells of R. sphaeroides strain 35053 and the ATCC 35053/ΔcrtE strain. Individual colonies of both strains, along with an E. coli control, were screened by PCR using the TETXBAF and STDDSMCSR primers to confirm the presence of the insert as described above.

[0265] pMCS2tetP/Stdxs/Rsdds

[0266] Nucleic acid encoding a S. trueperi polypeptide having DXS activity as well as nucleic acid encoding a R. sphaeroides polypeptide having DDS activity was cloned into the pMCS2tetP vector as follows. A vector containing both the S. trueperi dxs gene and the R. sphaeroides dds gene, each behind a tet promoter, was constructed using the pMCS2tetP/Stdxs construct described above as the starting vector. This vector was digested with restriction enzyme XbaI, cleaned with a QIAquick PCR Purification Kit, and digested with the restriction enzyme Bpu10I (Fermentas, Hanover, MD). The enzyme reaction was inactivated by heating for 20 minutes at 80° C. The digested vector DNA was then dephosphorylated using shrimp alkaline phosphatase and gel purified on a 1% TAE-agarose gel.

[0267] A PCR product containing a tet promoter region followed by a R. sphaeroides dds gene was amplified using the pMCS2tetP/Rsdds construct described above as template. The PCR mix contained the following: 1×Native Plus Pfu buffer, 5 ng plasmid template, 0.2 μM each primer, 0.2 mM each dNTP, 5% DMSO (v/v), and 10 units of native Pfu DNA polymerase in a final volume of 200 μL. The PCR reaction was performed in a MJ Research PTC100 under the following conditions: an initial denaturation at 94° C. for 2 minutes; 8 cycles of 94° C. for 30 seconds, 55° C. for 1 minute, and 72° C. for 3 minutes; 24 cycles of 94° C. for 30 seconds, 64° C. for 1 minute, and 72° C. for 3 minutes; and a final extension for 7 minutes at 72° C. The amplification product was then separated by gel electrophoresis using a 1% TAE-agarose gel. A 1.6 Kb fragment was excised from the gel and purified. The purified fragment was digested with Bpu10I, cleaned with a QIAquick PCR Purification Kit, digested with XbaI restriction enzyme, purified again with a QIAquick PCR Purification Kit, and quantified on a minigel.

[0268] Sixty ng of the prepared pMCS2tetP/Stdxs vector was ligated with 70 ng of the digested tetP/Rsdds PCR product using T4 DNA ligase at 16° C. for 16 hours. One μL of ligation reaction was used to electroporate 40 μL of E. coli Electromax™ DH5α™ cells. Electroporated cells were plated on LBK media. Individual colonies were screened by PCR using the RSDDSMCSF and STDXSMCSR primers, which produced a 4.1 Kb band. Individual colonies were resuspended in about 25 μL of 10 mM Tris, and 2 μL of the resuspension was plated on LBK. The remnant resuspension was heated for 10 minutes at 95° C. to break open the bacterial cells, and 2 μL of the heated cells was used in a 25 μL PCR reaction. The PCR reaction mix contained 0.2 μM each primer, 1×Genome Advantage (Clontech, Palo Alto, Calif.) reaction buffer, 1 M GCMelt, 1.1 mM Mg(OAc)₂, 0.2 mM each dNTP, and 1×Genome Advantage Polymerase. The PCR was conducted in a MJ Research PTC100 and consisted of an initial denaturation at 94° C. for 1.5 minutes; 32 cycles of a 30 second denaturation at 94° C., a 1 minute annealing at 60° C., and a 6.5 minute extension at 72° C.; followed by a final extension at 72° C. for 5 minutes. A large-scale plasmid prep was done for a colony that had the desired insert, and plasmid DNA was sequenced through the tetP/Rsdds region to confirm the lack of nucleotide errors from PCR. The resulting plasmid containing the Stdxs sequence under the control of the tet promoter and the Rsdds sequence under the control of the tet promoter was designated pMCS2tetP/Stdxs/Rsdds.

[0269] Plasmid DNA (pMCS2tetP/Stdxs/Rsdds) was electroporated into electrocompetent cells of R. sphaeroides strain 35053 and the ATCC 35053/ΔcrtE strain. Individual colonies of both strains, along with an E. coli control, were screened by PCR using the RSDDSMCSF and STDDSMCSR primers to confirm the presence of the insert as described above.

[0270] pMCS2tetP/Stdxr

[0271] Nucleic acid encoding an S. trueperi polypeptide having DXR activity was cloned into the pMCS2tetP vector as follows. The S. trueperi dxr gene was amplified using genomic DNA as template. The following primers were designed to introduce an EcoRV restriction site and a ribosomal binding site based on the R. sphaeroides dxs1 gene at the beginning of the amplified fragment and a KpnI site at the end of the amplified fragment.
SXRRVF 5′-GATGATATCGAAGGAAGAGCATGGTGAAGCGCGTCACGGTGT-3′ (SEQ ID NO:155)
SXRKPNR 5′-CAAGAGTCAGAAGGTACCCGCCAGAATGGTGAGCAGGATG-3′ (SEQ ID NO:156)

[0272] The PCR mix contained the following: 1×Native Plus Pfu buffer, 200 ng genomic DNA, 0.2 μM of each primer, 0.2 mM of each dNTP, 5% DMSO (v/v), and 10 units of native Pfu DNA polymerase in a final volume of 200 μL. The PCR reaction was performed in a MJ Research PTC100 under the following conditions: an initial denaturation at 94° C. for 2 minutes; 8 cycles of 94° C. for 30 seconds, 59° C. for 1 minute, and 72° C. for 3 minutes; 24 cycles of 94° C. for 30 seconds, 64° C. for 1 minute, and 72° C. for 3 minutes; and a final extension for 7 minutes at 72° C. The amplification product was then separated by gel electrophoresis using a 1% TAE-agarose gel. A 1.0 Kb fragment was excised from the gel and purified. The purified fragment was digested simultaneously with EcoRV and KpnI restriction enzymes, purified with a QIAquick PCR Purification Kit, and checked on a minigel.

[0273] Fifty ng of the EcoRV, KpnI-digested pMCS2tetP vector described above for the pMCS2tetP/Stdds construct was ligated with 75 ng of the digested S. trueperi dxr PCR product using T4 DNA ligase at 20° C. for 4 hours. One μL of ligation reaction was used to electroporate 40 μL of E. coli Electromax™ DH10B™ cells, which were then plated on LBK media. Individual colonies were selected and screened by PCR using the TETXBAF and SXRKPNR primers. The PCR mix contained the following: 1×Taq PCR buffer, 200 ng genomic DNA, 0.2 μM of each primer, 0.2 mM of each dNTP, 5% DMSO (v/v), and 1 unit of Taq DNA polymerase per 25 μL reaction. The PCR reaction was performed in a MJ Research PTC100 under the following conditions: an initial denaturation at 94° C. for 2 minutes; 32 cycles of 94° C. for 30 seconds, 64° C. for 1 minute, and 72° C. for 3 minutes; and a final extension for 7 minutes at 72° C. A large-scale plasmid preparation was done for a colony that had the desired insert, and the tetP/Stdxr region was sequenced to confirm the DNA sequence of the insert. The resulting plasmid containing the Stdxr sequence under the control of the tet promoter was designated pMCS2tetP/Stdxr.

[0274] Plasmid DNA (pMCS2tetP/Stdxr) was electroporated into electrocompetent cells of R. sphaeroides strains 35053 and ATCC 35053/ΔcrtE. Individual colonies of both strains, along with an E. coli control, were screened by PCR using the TETXBAF and SXRKPNR primers to confirm the presence of the insert as described above.

[0275] pMCS2tetP/Stdxr/Stdds

[0276] Nucleic acid encoding a S. trueperi polypeptide having DXR activity as well as nucleic acid encoding a S. trueperi polypeptide having DDS activity was cloned into the pMCS2tetP vector as follows. A vector containing both the S. trueperi dxr and dds genes, each behind a tet promoter, was constructed using the pMCS2tetP/Stdds construct described above as the starting vector. This vector was digested with restriction enzyme XbaI, cleaned with a QIAquick PCR Purification Kit, and digested with the restriction enzyme Bpu10I (Fermentas). The enzyme reaction was inactivated by heating for 20 minutes at 80° C. The digested vector DNA was then dephosphorylated with shrimp alkaline phosphatase and gel purified.

[0277] A PCR product containing a tet promoter region followed by an S. trueperi dxr gene was amplified using the pMCS2tetP/Stdxr construct described above as template and primers TETBPUF and SXRXBAR. The SXRXBAR primer, having the following sequence, was designed to introduce an XbaI restriction site on the end of the PCR product.
SXRXBAR 5′-CAAGAGTCAGAATCTAGACGCCAGAATGGTGAGCAGGATG-3′ (SEQ ID NO:157)
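The two dxr reverse primers (SEQ ID NO:156, used with KpnI, and SEQ ID NO:157, used with XbaI) can be compared position by position to show what changes between them. The sketch below is illustrative: the function name is hypothetical, the hyphens printed inside the sequences are assumed to be line-wrap marks and are omitted, and the first primer is named SXRKPNR as it is in the screening steps.

```python
SXRKPNR = "CAAGAGTCAGAAGGTACCCGCCAGAATGGTGAGCAGGATG"  # SEQ ID NO:156, KpnI site GGTACC
SXRXBAR = "CAAGAGTCAGAATCTAGACGCCAGAATGGTGAGCAGGATG"  # SEQ ID NO:157, XbaI site TCTAGA


def differing_positions(first, second):
    """Return the 0-based indices at which two equal-length primers differ."""
    if len(first) != len(second):
        raise ValueError("primers must have the same length")
    return [index for index, (a, b) in enumerate(zip(first, second)) if a != b]


if __name__ == "__main__":
    kpn_start = SXRKPNR.index("GGTACC")
    xba_start = SXRXBAR.index("TCTAGA")
    print("KpnI site starts at index", kpn_start)
    print("XbaI site starts at index", xba_start)
    print("differing indices:", differing_positions(SXRKPNR, SXRXBAR))
    print("shared 3' region:", SXRKPNR[kpn_start + 6:])
```

All differences fall inside the six-nucleotide restriction site; the 5′ tail and the 3′ gene-specific region are identical, so the second primer retargets the same dxr end to a different cloning site.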

[0278] The PCR mix contained the following: 1×Native Plus Pfu buffer, 5 ng plasmid template, 0.2 μM each primer, 0.2 mM each dNTP, 5% DMSO (v/v), and 10 units of native Pfu DNA polymerase in a final volume of 200 μL. The PCR reaction was performed in a MJ Research PTC100 under the following conditions: an initial denaturation at 94° C. for 2 minutes; 8 cycles of 94° C. for 30 seconds, 59° C. for 1 minute, and 72° C. for 3.5 minutes; 24 cycles of 94° C. for 30 seconds, 64° C. for 1 minute, and 72° C. for 3.5 minutes; and a final extension for 7 minutes at 72° C. The amplification product was then separated by gel electrophoresis using a 1% TAE-agarose gel. A 1.4 Kb fragment was excised from the gel and purified. The purified fragment was digested with Bpu10I, cleaned with a QIAquick PCR Purification Kit, digested with XbaI restriction enzyme, purified again with a QIAquick PCR Purification Kit, and quantified on a minigel.

[0279] Sixty ng of the prepared pMCS2tetP/Stdds vector was ligated with 80 ng of the digested tetP/Stdxr PCR product using T4 DNA ligase at 16° C. for 16 hours. One μL of ligation reaction was used to electroporate 40 μL of E. coli Electromax™ DH10B™ cells, which were then plated on LBK media. Individual colonies were screened by PCR using the SXREVF and SDSKPNR primers. Colonies were resuspended in about 25 μL of 10 mM Tris, and 2 μL of the resuspension was plated on LBK media. The remnant resuspension was heated for 10 minutes at 95° C. to break open the bacterial cells, and 2 μL of the heated cells was used in a 25 μL PCR reaction. The PCR mix contained the following: 1×Taq PCR buffer, 0.2 μM each primer, 0.2 mM each dNTP, 5% DMSO (v/v), and 1 unit of Taq DNA polymerase per reaction. The PCR reaction was performed in a MJ Research PTC100 under the following conditions: an initial denaturation at 94° C. for 2 minutes; 8 cycles of 94° C. for 30 seconds, 58° C. for 1 minute, and 72° C. for 4.5 minutes; 24 cycles of 94° C. for 30 seconds, 64° C. for 1 minute, and 72° C. for 4.5 minutes; and a final extension for 7 minutes at 72° C. A large-scale plasmid preparation was done for a colony that had the desired insert, and the tetP/Stdxr region was sequenced to confirm the lack of nucleotide errors from PCR. The resulting plasmid containing the Stdxr sequence under the control of the tet promoter and the Stdds sequence under the control of the tet promoter was designated pMCS2tetP/Stdxr/Stdds.

[0280] Plasmid DNA (pMCS2tetP/Stdxr/Stdds) was electroporated into electrocompetent cells of R. sphaeroides strains 35053 and ATCC 35053/ΔcrtE. Individual colonies of both strains, along with an E. coli control, were screened by PCR using the SXREVF and SDSKPNR primers to confirm the presence of the insert as described above.

[0281] pMCS2tetP/EcUbiC

[0282] Nucleic acid encoding an E. coli polypeptide having chorismate lyase activity was cloned into the pMCS2tetP vector as follows. The E. coli ubiC gene was amplified using genomic DNA from E. coli strain DH10B as template. The following primers were designed to introduce an EcoRV restriction site and a ribosomal binding site based on the R. sphaeroides dxs1 gene at the beginning of the amplified fragment, and a KpnI site at the end of the amplified fragment.
UBICRVF 5′-CTAGATATCGGAAGGAAGAGCATGTCACACCCCGCGTTA-3′ (SEQ ID NO:158)
UBICKPNR 5′-TCAGGTACCGTGTCGCCACCCACAACGCCCATAATG-3′ (SEQ ID NO:159)

[0283] The PCR mix contained the following: 1×Native Plus Pfu buffer, 200 ng genomic DNA, 0.2 μM each primer, 0.2 mM each dNTP, and 10 units of native Pfu DNA polymerase in a final volume of 200 μL. The PCR reaction was performed in a MJ Research PTC100 under the following conditions: an initial denaturation at 94° C. for 2 minutes; 8 cycles of 94° C. for 30 seconds, 57° C. for 1 minute, and 72° C. for 2.5 minutes; 24 cycles of 94° C. for 30 seconds, 64° C. for 1 minute, and 72° C. for 2.5 minutes; and a final extension for 7 minutes at 72° C. The amplification product was then separated by gel electrophoresis using a 1.5% TAE-agarose gel. A 650 bp fragment was excised from the gel and purified. The purified fragment was digested with EcoRV, cleaned with a QIAquick PCR Purification Kit, digested with KpnI restriction enzyme, purified again with a QIAquick PCR Purification Kit, and quantified on a minigel.

[0284] Seventy-five ng of the EcoRV, KpnI-digested pMCS2tetP vector described above for the pMCS2tetP/Stdds construct was ligated with 70 ng of the digested ubiC PCR product using T4 DNA ligase at 16° C. for 16 hours. One μL of ligation reaction was used to electroporate 40 μL of E. coli Electromax™ DH5α™ cells, which were then plated on LBK media. Individual colonies were resuspended in about 25 μL of 10 mM Tris, and 2 μL of the resuspension was plated on LBK. The remnant resuspension was heated for 10 minutes at 95° C. to break open the bacterial cells, and 2 μL of the heated cells was used in a 25 μL PCR reaction using the TETXBAF and UBICKPNR primers. The PCR mix contained the following: 1×Taq PCR buffer, 0.2 μM each primer, 0.2 mM each dNTP, and 1 unit of Taq DNA polymerase per reaction. The PCR reaction was performed in a MJ Research PTC100 under the following conditions: an initial denaturation at 94° C. for 2 minutes; 32 cycles of 94° C. for 30 seconds, 62° C. for 1 minute, and 72° C. for 2 minutes; and a final extension for 7 minutes at 72° C. A large-scale plasmid preparation was done for a colony that had the desired insert, and the tetP/ubiC region was sequenced to confirm the DNA sequence of the insert. The resulting plasmid containing the UbiC sequence under the control of the tet promoter was designated pMCS2tetP/EcUbiC.

[0285] Plasmid DNA (pMCS2tetP/EcUbiC) was electroporated into electrocompetent cells of R. sphaeroides strain 35053 and the ATCC 35053/ΔcrtE strain. Individual colonies of both strains, along with an E. coli control, were screened by PCR using the TETXBAF and UBICKPNR primers to confirm the presence of the insert as described above with the addition of 5% DMSO (v/v) to the PCR reaction.

[0286] pMCS2tetP/Stdxs/Rsdds/EcUbiC

[0287] Nucleic acid encoding an S. trueperi polypeptide having DXS activity, nucleic acid encoding an R. sphaeroides polypeptide having DDS activity, and nucleic acid encoding an E. coli polypeptide having chorismate lyase activity were cloned into the pMCS2tetP vector as follows. A vector containing the S. trueperi dxs gene, the R. sphaeroides dds gene, and the E. coli ubiC gene, each behind a tet promoter, was constructed using the pMCS2tetP/Stdxs/Rsdds construct described above as the starting vector. This vector was digested with restriction enzyme KpnI, cleaned with a QIAquick PCR Purification Kit, and digested with the restriction enzyme NsiI. The enzyme reaction was inactivated by heating for 20 minutes at 65° C. The digested vector DNA was then dephosphorylated with shrimp alkaline phosphatase and gel purified.

[0288] A PCR product containing a tet promoter region followed by an E. coli ubiC gene was amplified using the pMCS2tetP/EcUbiC construct described above as template. The following primers were designed to introduce a KpnI restriction site at the beginning of the amplified fragment and an NsiI site at the end of the amplified fragment. TETKPNF 5′-TAGGGTACCACCGTCTACGCCGACCTCGTTCAAC-3′ (SEQ ID NO:160) UBICNSIR 5′-TGTATGCATGTCGCCACCCACAACGCCCATAATG-3′ (SEQ ID NO:161)
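The primer design described in paragraph [0288] can be checked with a short script. The following is a minimal sketch, not part of the described method: it confirms that each primer carries the recognition sequence of the enzyme named for it, using the standard recognition sequences GGTACC (KpnI) and ATGCAT (NsiI). The helper names are hypothetical, and any hyphen inside a printed primer sequence is assumed to be a line-wrap artifact and is removed before searching.

```python
# Recognition sequences for the two enzymes named in paragraph [0288].
SITES = {"KpnI": "GGTACC", "NsiI": "ATGCAT"}

# Primer name -> (sequence as printed, enzyme whose site it should carry)
PRIMERS = {
    "TETKPNF": ("TAGGGTACCACCGTCTACGCCGACCT-CGTTCAAC", "KpnI"),
    "UBICNSIR": ("TGTATGCATGTCGCCACCCACAACGC-CCATAATG", "NsiI"),
}

def clean(seq):
    """Drop hyphens (assumed to be line-wrap artifacts) and upper-case."""
    return seq.replace("-", "").upper()

def site_offset(seq, enzyme):
    """Return the 0-based offset of the enzyme site in seq, or -1 if absent."""
    return clean(seq).find(SITES[enzyme])

for name, (seq, enzyme) in PRIMERS.items():
    offset = site_offset(seq, enzyme)
    status = f"found at offset {offset}" if offset >= 0 else "not found"
    print(f"{name}: {enzyme} site {status}")
```

In both primers the site sits a few nucleotides in from the 5′ end, leaving a short 5′ overhang of extra bases ahead of the site.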

[0289] The PCR mix contained the following: 1×Native Plus Pfu buffer, 5 ng plasmid template, 0.2 μM each primer, 0.2 mM each dNTP, 5% DMSO (v/v), and 10 units of native Pfu DNA polymerase in a final volume of 200 μL. The PCR reaction was performed in an MJ Research PTC100 under the following conditions: an initial denaturation at 94° C. for 2 minutes; 8 cycles of 94° C. for 30 seconds, 62° C. for 1 minute, and 72° C. for 2.5 minutes; 24 cycles of 94° C. for 30 seconds, 66° C. for 1 minute, and 72° C. for 2.5 minutes; and a final extension for 7 minutes at 72° C. The amplification product was then separated by gel electrophoresis using a 1% TAE-agarose gel. An 850 bp fragment was excised from the gel and purified. The purified fragment was digested with the restriction enzyme NsiI, cleaned with a QIAquick PCR Purification Kit, digested with the restriction enzyme KpnI, purified again with a QIAquick PCR Purification Kit, and quantified on a minigel.

[0290] Fifty ng of the prepared pMCS2tetP/Stdxs/Rsdds vector was ligated with 35 ng of the digested tetP/ubiC PCR product using T4 DNA ligase at 16° C. for 16 hours. One μL of ligation reaction was used to electroporate 40 μL of E. coli Electromax™ DH10B™ cells, which were then plated on LBK media. Individual colonies were resuspended in about 25 μL of 10 mM Tris, and 2 μL of the resuspension was plated on LBK. The remnant resuspension was heated for 10 minutes at 95° C. to break open the bacterial cells, and 2 μL of the heated cells was used in a 25 μL PCR reaction using the SXSCLAF2 and UBICNSIR primers. The PCR reaction mix contained 1×GC-RICH PCR reaction buffer, 1.0 M GC-RICH resolution solution, 0.2 μM each primer, 0.2 mM each dNTP, and 1 unit of GC-RICH enzyme mix per reaction (Roche). The PCR reaction was performed in an MJ Research PTC100 under the following conditions: an initial denaturation at 94° C. for 2 minutes; 8 cycles of 94° C. for 30 seconds, 60° C. for 1 minute, and 72° C. for 5 minutes; 24 cycles of 94° C. for 30 seconds, 64° C. for 1 minute, and 72° C. for 5 minutes; and a final extension for 7 minutes at 72° C. A large-scale plasmid preparation was done for a colony that had the desired insert, and plasmid DNA was sequenced through the tetP/ubiC region to confirm the lack of nucleotide errors from PCR. The resulting plasmid containing the Stdxs sequence under the control of the tet promoter, the Rsdds sequence under the control of the tet promoter, and the UbiC sequence under the control of the tet promoter was designated pMCS2tetP/Stdxs/Rsdds/EcUbiC.

[0291] Plasmid DNA (pMCS2tetP/Stdxs/Rsdds/EcUbiC) was electroporated into electrocompetent cells of R. sphaeroides strains 35053 and ATCC 35053/ΔcrtE. Individual colonies of both strains, along with an E. coli control, were screened by PCR using the SXSCLAF2 and UBICNSIR primers to confirm the presence of the insert as described above.

[0292] pMCS2tetP/RsLytB

[0293] Nucleic acid encoding an R. sphaeroides LytB polypeptide was cloned into the pMCS2tetP vector as follows. The R. sphaeroides lytB gene was identified by TBLASTN analysis of its genome using an E. coli lytB sequence as a query. Based on the identified sequence, the following primers were designed to PCR amplify the gene: LYTBHINDF 5′-GACGAAGCTTGAAGGAAGAGCATGCCTCCCCTCACCCTCTATC-3′ (SEQ ID NO:162) LYTBKPNR 5′-GTCACTGAATGAATGGTACCGCAGCCGAGAACCGCCAGAAGCC-3′ (SEQ ID NO:163)

[0294] The primers introduced a HindIII restriction site and ribosomal binding site at the 5′ end, and a KpnI restriction site at the 3′ end. The following reaction mix and PCR program were used to amplify the lytB gene.

Reaction Mix
Pfu 10X buffer: 10 μL
DMSO: 5 μL
dNTP mix (10 mM): 3 μL
LYTBHINDF (100 μM): 1 μL
LYTBKPNR (100 μM): 1 μL
Genomic DNA (50 ng/μL): 2 μL
Pfu enzyme (2.5 U/μL): 2 μL
DI water: 76 μL
Total: 100 μL

Program
94° C. for 2 minutes
7 cycles of: 94° C. for 30 seconds, 59° C. for 45 seconds, 72° C. for 3 minutes
25 cycles of: 94° C. for 30 seconds, 66° C. for 45 seconds, 72° C. for 3 minutes
72° C. for 7 minutes
4° C. until used further
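The reaction mix of paragraph [0294] can be sanity-checked by confirming that the listed volumes add up to the stated total and by converting stock concentrations to final concentrations (stock × volume added ÷ total volume). The sketch below is illustrative only; the data layout and function name are hypothetical, and treating the DMSO stock as neat (100% v/v) is an assumption not stated in the text.

```python
TOTAL_UL = 100.0  # total volume stated in the reaction mix

# (component, stock concentration, unit of the stock, volume added in uL)
MIX = [
    ("Pfu 10X buffer", 10.0, "X", 10.0),
    ("DMSO", 100.0, "% v/v", 5.0),        # neat DMSO is an assumption
    ("dNTP mix", 10.0, "mM", 3.0),
    ("LYTBHINDF", 100.0, "uM", 1.0),
    ("LYTBKPNR", 100.0, "uM", 1.0),
    ("Genomic DNA", 50.0, "ng/uL", 2.0),
    ("Pfu enzyme", 2.5, "U/uL", 2.0),
    ("DI water", None, "", 76.0),         # diluent, no stock concentration
]

def final_concentrations(mix, total_ul):
    """Final concentration of each component, in the unit of its stock."""
    return {
        name: stock * vol / total_ul
        for name, stock, _unit, vol in mix
        if stock is not None
    }

volume_sum = sum(vol for *_rest, vol in MIX)
final = final_concentrations(MIX, TOTAL_UL)

print(f"Sum of component volumes: {volume_sum} uL")
for name, stock, unit, _vol in MIX:
    if stock is not None:
        print(f"{name}: {final[name]:g} {unit}")
```

With these inputs the volumes sum to the stated 100 μL, the buffer comes out at 1X, each primer at 1 μM, and DMSO at 5% (v/v).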

[0295] The PCR product was run on a 1% TAE-agarose gel, and a fragment about 1.1 Kb in size was excised. The excised DNA was isolated using a Qiagen gel isolation kit. The isolated DNA was restricted with HindIII and KpnI, and was column purified using a Qiagen gel isolation kit. Two μg of pMCS2tetP vector DNA was digested with HindIII, and the linear DNA was gel isolated using a Qiagen gel isolation kit. The vector was further digested with KpnI, and the DNA was column purified. The double-digested vector was then dephosphorylated with shrimp alkaline phosphatase and column purified using a Qiagen gel purification kit. The KpnI/HindIII-digested R. sphaeroides lytB PCR product with the R. sphaeroides dxs I ribosomal binding site described above was ligated into the prepared vector using T4 DNA ligase for 14-16 hours at 16° C. One μL of the ligation reaction was transformed into E. coli Electromax™ DH10B™ cells, which were then plated on LBK (25 μg/mL) media. Individual colonies were resuspended in about 25 μL of DI water, and 2 μL of the resuspension was plated on LBK. The remnant resuspension was heated for 10 minutes at 95° C. to break open the bacterial cells, and 2 μL of the heated cells was used in a 25 μL PCR reaction using the LYTBHINDF and LYTBKPNR primers. The PCR mix contained the following: 1×Taq PCR buffer, 0.2 μM each primer, 0.2 mM each dNTP, 5% DMSO (v/v), and 1 unit of Taq DNA polymerase per reaction. The PCR reaction was performed under the following conditions: an initial denaturation at 94° C. for 2 minutes; 8 cycles of 94° C. for 30 seconds, 59° C. for 1 minute, and 72° C. for 3 minutes; 24 cycles of 94° C. for 30 seconds, 66° C. for 1 minute, and 72° C. for 3 minutes; and a final extension for 7 minutes at 72° C. A large-scale plasmid preparation was done on a culture of a colony containing the lytB PCR product, and the tetP/lytB region was sequenced to confirm the lack of nucleotide errors. The resulting plasmid containing the RsLytB sequence under the control of the tet promoter was designated pMCS2tetP/RsLytB.

[0296] Plasmid DNA (pMCS2tetP/RsLytB) was electroporated into electrocompetent cells of R. sphaeroides strain 35053 and a carotenoid-deficient mutant of 35053 (ATCC 35053/ΔcrtE). Individual colonies of both strains, along with an E. coli control, were screened by PCR using the TETXBAF and LYTBKPNR primers to confirm the presence of the insert as described above.

[0297] pMCS2tetP/Stdxs/Rsdds/RsLytB

[0298] Nucleic acid encoding an S. trueperi polypeptide having DXS activity, nucleic acid encoding an R. sphaeroides polypeptide having DDS activity, and nucleic acid encoding LytB from R. sphaeroides were cloned into the pMCS2tetP vector as follows. The R. sphaeroides lytB gene was cloned and expressed along with the R. sphaeroides dds and S. trueperi dxs genes. In this triple expression system, each gene was expressed through its own tetP. The R. sphaeroides lytB gene was PCR amplified along with the tetP using the following primers. TETKPNF 5′-TAGGGTACCACCGTCTACGCCGACCTCGTTGAAC-3′ (SEQ ID NO:164) LYTBNSIR 5′-AGGCAATGCATGCAGCCGAGAACCGCCAGAAGCC-3′ (SEQ ID NO:165)

[0299] The following PCR mix and program were used to PCR amplify the lytB gene along with the tetP.

Reaction Mix
Pfu 10X buffer: 10 μL
DMSO: 5 μL
dNTP mix (10 mM): 3 μL
TETKPNF (100 μM): 1 μL
LYTBNSIR (100 μM): 1 μL
pMCS2tetP/lytB (10 ng/μL): 1 μL
Pfu enzyme (2.5 U/μL): 2 μL
DI water: 77 μL
Total: 100 μL

Program
94° C. for 2 minutes
7 cycles of: 94° C. for 30 seconds, 63° C. for 45 seconds, 72° C. for 3 minutes
25 cycles of: 94° C. for 30 seconds, 69° C. for 45 seconds, 72° C. for 3 minutes
72° C. for 7 minutes
4° C. until used further

[0300] In this PCR reaction, pMCS2tetP/RsLytB plasmid DNA was used as a template. The PCR product was separated on a 1% TAE-agarose gel, and a fragment about 1.4 Kb in size was excised. The excised DNA was isolated using a Qiagen gel isolation kit. The isolated DNA was restricted with NsiI and KpnI, and was column purified using a Qiagen gel isolation kit. Two μg of pMCS2tetP/Stdxs/Rsdds plasmid DNA was digested with NsiI, and the linear DNA was gel isolated using a Qiagen gel isolation kit. The vector was further digested with KpnI, and the DNA was column purified. The double-digested vector was then dephosphorylated with shrimp alkaline phosphatase and column purified using a Qiagen gel purification kit. The KpnI/NsiI-digested PCR product was ligated into the prepared plasmid using T4 DNA ligase for 14-16 hours at 16° C. One μL of the ligation reaction was transformed into E. coli Electromax™ DH10B™ cells, which were then plated on LBK (25 μg/mL) media. Individual colonies were resuspended in about 25 μL of DI water, and 2 μL of the resuspension was plated on LBK. The remnant resuspension was heated for 10 minutes at 95° C. to break open the bacterial cells, and 2 μL of the heated cells was used in a 25 μL PCR reaction using the SXSCLAF2 and LYTBNSIR primers. The PCR mix contained the following: 1×Taq PCR buffer, 0.2 μM each primer, 0.2 mM each dNTP, 5% DMSO (v/v), and 1 unit of Taq DNA polymerase per reaction. The PCR reaction was performed under the following conditions: an initial denaturation at 94° C. for 2 minutes; 6 cycles of 94° C. for 30 seconds, 59° C. for 45 seconds, and 72° C. for 4 minutes; 25 cycles of 94° C. for 30 seconds, 65° C. for 45 seconds, and 72° C. for 4 minutes; and a final extension for 7 minutes at 72° C. A large-scale plasmid preparation was done on a culture of a colony containing the correct insert, and the tetP/lytB region was sequenced to confirm the lack of nucleotide errors. The resulting plasmid containing the Stdxs sequence under the control of the tet promoter, the Rsdds sequence under the control of the tet promoter, and the LytB sequence under the control of the tet promoter was designated pMCS2tetP/Stdxs/Rsdds/RsLytB.

[0301] Plasmid DNA (pMCS2tetP/Stdxs/Rsdds/RsLytB) was electroporated into electrocompetent cells of R. sphaeroides strain 35053 and a carotenoid-deficient mutant of 35053 (ATCC 35053/ΔcrtE). Individual colonies of both strains were screened by PCR using the SXSCLAF2 and LYTBNSIR primers to confirm the presence of the insert as described above.

Example 9 Making Recombinant Microorganisms Containing Knock-Outs

[0302] Various nucleic acid sequences within the R. sphaeroides genome were knocked out. All restriction enzymes and T4 DNA ligase were obtained from New England Biolabs (Beverly, Mass.) unless otherwise indicated. All plasmid DNA preparations were done using QIAprep Spin Miniprep Kits or Qiagen Maxi Prep Kits, and all gel purifications were done using QIAquick Gel Extraction Kits (Qiagen, Valencia, Calif.).

[0303] ATCC 35053/ΔcrtE(kan)

[0304] R. sphaeroides cells lacking crtE were made by inserting a kanamycin resistance gene into the crtE sequence as follows. In general, the crtE gene from R. sphaeroides was cloned into a pUC19 vector, and a kanamycin gene (kan) was inserted into the gene to inactivate it. The crtE-kan insert was amplified by PCR and cloned into pSUP203, a mobilizable ColE1-based plasmid that is not maintained in R. sphaeroides unless it is integrated into an R. sphaeroides replicon. This plasmid was transformed into E. coli strain S17-1, a strain that is able to mobilize oriT-containing plasmids in conjugations with a second bacterial strain. The S17-1 strain was conjugated with R. sphaeroides strain 35053, and colonies were identified in which the crtE-kan insert had replaced the native crtE gene.

[0305] The crtE gene from R. sphaeroides strain 17023 was amplified by PCR using primers designed to introduce an SphI restriction site at the beginning of the amplified fragment and an XbaI restriction site at the end of the amplified fragment. The sequences of the primers were as follows. (SEQ ID NO:166) CRTESPHF 5′-AAGCATGCGAAAAAGTTGACACCTGTGGAGTC-3′ (SEQ ID NO:167) CRTEXBAR 5′-ACTCTAGAAGCACCTGCGAATGGACGAAG-3′

[0306] The fragment amplified included the crtE gene along with 85 nucleotides upstream of the translational start codon and 228 nucleotides downstream of the translational stop codon. The PCR reaction mix contained 0.2 μM each primer, 1×GC Genomic PCR Buffer (Clontech, Palo Alto, Calif.), 1 M GC-Melt, 1.1 mM Mg(OAc)₂, 0.2 mM each dNTP, 1×Advantage-GC Genomic Polymerase Mix, and 1 ng of genomic DNA per μL of reaction mix. The PCR was conducted in a Perkin Elmer Geneamp 2400 and consisted of an initial denaturation at 94° C. for 30 seconds; 35 cycles of a 15 second denaturation at 94° C., a one minute annealing at 55° C., and a 3 minute extension at 72° C.; followed by a final extension at 72° C. for 5 minutes. Fifty μL of PCR product was separated on a 1% Tris-Acetate-EDTA (TAE)-agarose gel. A 1180 bp fragment was gel purified, and the purified DNA was digested with XbaI and SphI restriction enzymes (Promega, Madison, Wis.).

[0307] pUC19 vector was digested with the restriction enzymes SphI and XbaI, and gel purified on a 1% TAE-agarose gel. Fifty ng of purified vector was ligated with about 150 ng of digested crtE PCR product for 16 hours at 14° C. using T4 DNA ligase (Roche Molecular Biochemicals, Indianapolis, Ind.). One μL of ligation reaction was transformed into ElectroMAX™ DH10B™ cells (Life Technologies, Gaithersburg, Md.), which were then plated on LB media containing 100 μg/mL ampicillin and 50 μg/mL of 5-Bromo-4-Chloro-3-Indolyl-β-D-Galactopyranoside (LBKX). Individual, white colonies were resuspended in about 20 μL of 10 mM Tris, and 2 μL of the resuspension was plated on LBKX media. The remnant resuspension was heated for 10 minutes at 95° C. to break open the bacterial cells, and 2 μL of the heated cells was used in a 25 μL PCR reaction using the CRTESPHF and CRTEXBAR primers. The PCR reaction mix contained 0.2 μM each primer, 1×GC Genomic PCR Buffer, 1 M GCMelt, 1.1 mM Mg(OAc)₂, 0.2 mM each dNTP, and 1×Advantage-GC Genomic Polymerase Mix. The PCR was conducted in a Perkin Elmer Geneamp 2400 and consisted of an initial denaturation at 94° C. for 30 seconds; 35 cycles of a 15 second denaturation at 94° C., a one minute annealing at 55° C., and a 3 minute extension at 72° C.; followed by a final extension at 72° C. for 5 minutes. Plasmid DNA was isolated for colonies having a crtE gene insert and was digested with the restriction enzyme HindIII and with a mixture of SphI and XbaI to confirm vector structure.

[0308] One μg of the pUC19crtE construct was digested with XhoI and StuI restriction enzymes. These enzymes cut a 273 bp fragment of DNA from the center of the crtE gene. The digested DNA was separated on a 1% TAE-agarose gel. A 3.6 Kb fragment representing pUC19 and the remaining ends of the crtE gene was excised and purified.

[0309] The kanamycin resistance gene was amplified by PCR from the pCRII vector (Invitrogen, Carlsbad, Calif.) using primers designed to introduce a StuI restriction site at the beginning of the amplified fragment and an XhoI restriction site at the end of the amplified fragment. The sequences of the primers were as follows. (SEQ ID NO:168) KANSTUF 5′-ATAAAGGCCTTACATGGCGATAGCTAGACTG-3′ (SEQ ID NO:169) KANXHOR 5′-AAGGCTCGAGAAGGATCTTACCGCTGTTGAG-3′

[0310] The PCR reaction mix contained 0.2 μM each primer, 1×Pfu reaction buffer (Stratagene, La Jolla, Calif.), 0.2 mM each dNTP, 8 units Pfu, and 5 ng of the pCRII vector in a 200 μL reaction. The PCR was conducted in a Perkin Elmer Geneamp 2400 and consisted of an initial denaturation at 94° C. for 2 minutes; 8 cycles of a 30 second denaturation at 94° C., a 1 minute annealing at 55° C., and a 2.5 minute extension at 72° C.; 24 cycles of a 30 second denaturation at 94° C., a 1 minute annealing at 55° C., and a 2.5 minute extension at 72° C.; followed by a final extension at 72° C. for 5 minutes. The PCR product was separated on a 1% TAE-agarose gel, and a 1.2 Kb fragment was excised and purified. One μg of purified DNA was digested with XhoI and StuI restriction enzymes and cleaned using a QIAquick PCR Purification Kit.

[0311] Fifty ng of the digested pUC19crtE vector DNA was ligated with 75 ng of the digested kan PCR product for 16 hours at 14° C. using T4 DNA ligase (Roche). One μL of ligation mix was electroporated into 40 μL of E. coli ElectroMAX™ DH10B™ electrocompetent cells, which were then plated on LB media containing 100 μg/mL ampicillin and 50 μg/mL kanamycin (LBAK). Plasmid DNA was isolated from cultures of individual colonies and was digested in separate reactions with the restriction enzymes PstI, SphI, and a StuI/XbaI mixture to confirm correct vector structure.

[0312] The crtE gene with the inserted kan gene was amplified by PCR using primers designed to have ScaI restriction sites on both ends of the fragment. The sequences of the primers were as follows. (SEQ ID NO:170) CRTESCAF 5′-ATAGTACTGAAAAAGTTGACACCTGTGGAGTC-3′ (SEQ ID NO:171) CRTESCAR 5′-ATAGTACTAGCACCTGCGAATGGACGAAG-3′

[0313] The PCR reaction mix contained 0.2 μM each primer, 1×GC Genomic PCR Buffer, 1 M GCMelt, 1.1 mM Mg(OAc)₂, 0.2 mM each dNTP, 1×Advantage-GC Genomic Polymerase Mix, and 1 ng of plasmid DNA per μL of reaction mix. The PCR was conducted in a Perkin Elmer Geneamp 9600 and consisted of an initial denaturation at 94° C. for 1 minute; 8 cycles of a 30 second denaturation at 94° C., a 1 minute annealing at 55° C., and a 4 minute extension at 72° C.; 25 cycles of a 30 second denaturation at 94° C., a 1 minute annealing at 60° C., and a 4 minute extension at 72° C.; followed by a final extension at 72° C. for 5 minutes. 200 μL of PCR product was separated on a 1% TAE-agarose gel. A 2.0 Kb fragment was excised and purified. One μg of purified DNA was digested with ScaI restriction enzyme, and the digested DNA was purified using a QIAquick PCR Purification Kit.

[0314] 2.3 μg of pSUP203 plasmid DNA was digested with ScaI restriction enzyme. The digested DNA was separated on a 1% TAE-agarose gel, and a 7.6 Kb fragment was excised and purified. The purified plasmid DNA was then dephosphorylated using calf intestinal alkaline phosphatase (Promega). 75 ng of dephosphorylated plasmid DNA was ligated with 60 ng and 120 ng of the ScaI-digested crtE-kan PCR product for 16 hours at 14° C. using T4 DNA ligase (New England BioLabs). One μL of ligation mix was electroporated into 40 μL of E. coli ElectroMAX™ DH10B™ electrocompetent cells, which were then plated on LB media containing 10 μg/mL tetracycline, to which pSUP203 carries a resistance gene, and 25 μg/mL kanamycin. Plasmid DNA was isolated from cultures of individual colonies and digested with ScaI restriction enzyme to check insert size. 100 ng of plasmid DNA derived from a confirmed colony was electroporated into electrocompetent cells of the E. coli strain S17-1. This strain contains a chromosomal copy of the trans-acting elements that mobilize oriT-containing plasmids during conjugation with a second bacterial strain. It also carries a gene conferring resistance to the antibiotics streptomycin and spectinomycin. The transformation reaction was plated on LB media with 10 μg/mL tetracycline, 25 μg/mL kanamycin, and 25 μg/mL streptomycin. Individual colonies were resuspended in about 20 μL of 10 mM Tris and heated for 10 minutes at 95° C. to break open the bacterial cells. Two μL of the heated cells was used in a 25 μL PCR reaction using the CRTESCAF and CRTESCAR primers to confirm the presence of the crtE-kan insert. The PCR reaction mix contained 0.2 μM each primer, 1×GC Genomic PCR Buffer, 1.0 M GCMelt, 1.1 mM Mg(OAc)₂, 0.2 mM each dNTP, and 1×Advantage-GC Genomic Polymerase Mix. The PCR was conducted in a Perkin Elmer Geneamp 9600 and consisted of an initial denaturation at 94° C. for 1 minute; 30 cycles of a 30 second denaturation at 94° C., a 1 minute annealing at 56° C., and a 4 minute extension at 72° C.; followed by a final extension at 72° C. for 5 minutes.
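The two insert amounts used in the ligation of paragraph [0314] can be expressed as insert-to-vector molar ratios. The following is a rough sketch, not a statement of how the amounts were chosen: it assumes both DNAs are linear and double-stranded with the same mass per base pair, so that moles are proportional to mass divided by length. The fragment sizes (7.6 Kb vector, 2.0 Kb crtE-kan insert) are taken from the text; the function name is hypothetical.

```python
def insert_to_vector_molar_ratio(insert_ng, insert_kb, vector_ng, vector_kb):
    """Molar ratio of insert to vector, assuming equal mass per base pair."""
    return (insert_ng / insert_kb) / (vector_ng / vector_kb)

VECTOR_NG, VECTOR_KB = 75.0, 7.6   # dephosphorylated pSUP203 fragment
INSERT_KB = 2.0                    # ScaI-digested crtE-kan PCR product

ratios = {
    insert_ng: insert_to_vector_molar_ratio(insert_ng, INSERT_KB, VECTOR_NG, VECTOR_KB)
    for insert_ng in (60.0, 120.0)
}

for insert_ng, ratio in ratios.items():
    print(f"{insert_ng:.0f} ng insert: about {ratio:.1f}:1 insert:vector")
```

Under these assumptions the 60 ng and 120 ng amounts correspond to roughly 3:1 and 6:1 molar ratios of insert to vector.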

[0315] The pSUP203crtE-kan construct was introduced into R. sphaeroides strain 35053 through conjugation with the E. coli S17-1 strain carrying this vector. The S17-1 donor was grown in LB media with 25 μg/mL kanamycin and 25 μg/mL streptomycin at 37° C. for 16 hours. A growing culture of R. sphaeroides strain 35053 was used to inoculate Sistrom's media using 1/5 to 1/10 dilutions, and the subcultures were grown at 30° C. for about 20 hours. For both the S17-1crtE-kan and 35053 genotypes, cells were pelleted from 1.5 mL of culture. Pellets were resuspended and pelleted four times in either 1×Sistrom's salts for the 35053 cells or LB media for the S17-1 cells. The pellets were each resuspended in 1.5 mL of LB, and 200 μL of the S17-1 cells was combined with 1.3 mL of the 35053 cells. This mixture was pelleted, the supernatant removed, and the pellet resuspended in 20 μL of LB media. The resuspended cells were spotted onto an LB plate and incubated at 30° C. for 7.5 hours. The cells were then scraped off the plate, resuspended in 1.5 mL of 1×Sistrom's salts, and plated (200 μL/plate) on Sistrom's media supplemented with 25 μg/mL kanamycin and 10 μg/mL of telluride (SisKTell). The telluride retards the growth of E. coli cells but is detoxified by R. sphaeroides. After 7 days, small black colonies were picked off the plates and streaked to fresh plates of the same media. After 6 days of growth, grayish colonies were patched to LB plates containing 25 μg/mL kanamycin (LBK25) and also to LB plates containing 0.75 μg/mL tetracycline. Desirable double-crossover events, in which the crtE-kan gene was integrated and retained in the genome while the vector DNA was lost, exhibited kanamycin resistance but lacked tetracycline resistance. Colonies resulting from undesirable single-crossover events demonstrated both kanamycin and tetracycline resistance.
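The screening logic at the end of paragraph [0315] amounts to a two-marker decision rule, which can be sketched as follows. This is an illustration only; the function name and the label used for kanamycin-sensitive colonies are hypothetical, since the text does not discuss that outcome.

```python
def classify_colony(kan_resistant, tet_resistant):
    """Classify a patched colony from its growth on the two antibiotic plates."""
    if kan_resistant and not tet_resistant:
        # crtE-kan retained in the genome, vector DNA lost
        return "double crossover"
    if kan_resistant and tet_resistant:
        # whole vector, including its tetracycline marker, integrated
        return "single crossover"
    # Outcome not described in the text; label chosen for this sketch only.
    return "no crtE-kan integration detected"

for kan, tet in [(True, False), (True, True), (False, False)]:
    print(f"kanR={kan}, tetR={tet}: {classify_colony(kan, tet)}")
```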

[0316] The mutants were confirmed using PCR and Southern hybridization as follows. Colonies that exhibited kanamycin resistance, lacked tetracycline resistance, and had a gray phenotype were screened by PCR for the crtE locus using the CRTESCAF and CRTESCAR primers as described above. To confirm that they were R. sphaeroides colonies with a truncated crtE gene rather than E. coli colonies carrying the vector, colonies were also screened using primers specific to the R. sphaeroides ppsR gene and the E. coli dxs gene. Individual colonies were resuspended in about 20 μL of 10 mM Tris, and heated for 10 minutes at 95° C. to break open the bacterial cells. Two μL of the heated cells were used per 25 μL PCR reaction. The PCR reaction mix contained 0.2 μM each primer, 1×GC Genomic PCR Buffer, 1.0 M GCMelt, 1.1 mM Mg(OAc)₂, 0.2 mM each dNTP, and 1×Advantage-GC Genomic Polymerase Mix. The PCR was conducted in a Perkin Elmer Geneamp 9600 and consisted of an initial denaturation at 94° C. for 1 minute; 8 cycles of a 30 second denaturation at 94° C., a 1 minute annealing at 55° C., and a 3.5 minute extension at 72° C.; 22 cycles of a 30 second denaturation at 94° C., a 1 minute annealing at 61° C., and a 3.5 minute extension at 72° C.; followed by a final extension at 72° C. for 5 minutes. All suspected 35053crtE-kan colonies produced a crtE band the same size as the S17-1 crtE-kan control. They all also produced a band of the expected size for the ppsR gene and did not produce a band for the E. coli dxs gene.

[0317] To further confirm the presence of double-crossover events, Southern hybridization was conducted on eight 35053crtE-kan colonies as well as R. sphaeroides strains 35053 and 17023. Sequence data for the photosynthetic operon of strain 17023 is available in Genbank and was used to determine restriction enzymes likely to have hybridization patterns that would distinguish mutants from non-mutants. Genomic DNA was isolated from each line using a Gentra Puregene DNA Isolation Kit (Gentra, Minneapolis, Minn.). Two μg of genomic DNA was used in digests with the restriction enzymes ApaI and XhoI. The digests were separated on a 0.8% TAE-agarose gel, and the DNA was transferred to a nylon membrane. DIG-labeled molecular weight markers II and III (Roche) were also included on the gel/membrane. DIG-labeled probes of the crtE locus were synthesized using a PCR DIG Probe Synthesis Kit (Roche). After baking, membranes were prehybridized in EasyHyb Buffer (Roche) for at least 2 hours and hybridized overnight using 400 nL of a 0.5 DIG labeling reaction per mL of hybridization solution. Detection was conducted using a Wash and Block Buffer Set (Roche). Membranes were washed two times for 5-10 minutes each at room temperature in 2×SSC/0.1% SDS and two times for 15-20 minutes each at 68° C. in 0.1×SSC/0.1% SDS. They were then covered with blocking buffer and placed on a shaker for an hour at room temperature. The blocking buffer was replaced with fresh blocking buffer containing 150 mU of AP conjugate per mL of buffer, and the membranes were shaken at room temperature for an additional 30 minutes. Membranes were then washed twice for 15 minutes each at room temperature with washing buffer, followed by a five minute wash with detection buffer. The detection buffer was replaced with fresh detection buffer containing 20 μL of NBT/BCIP solution per mL of buffer. This was placed in the dark at room temperature with no shaking until color developed, after which the buffer was replaced with 10 mM Tris-1 mM EDTA solution.

[0318] In the ApaI digest, the mutant lines exhibited a band of about 850 bp larger than the strain 35053 control, which is the size difference expected from the insertion of the kanamycin gene product in the StuI/XhoI sites. For the XhoI digest, strain 35053 exhibited a band of about 700 bp, strain 17023 had a band of about 1100 bp, mutant 7C had a band of 1550 bp, and the remaining mutants had a band of 2050 bp. The reason for the size difference in the XhoI bands for the mutants was unclear, but mutant 7C was used in further studies due to its possession of the expected band size relative to strain 35053. The resulting R. sphaeroides mutant containing a crtE knockout was designated ATCC 35053/ΔcrtE(kan).

[0319] ATCC 35053/ΔcrtE

[0320] R. sphaeroides cells lacking crtE were made using sacB selection as follows. A truncated crtE gene was cloned into the vector pL01, which is a suicide vector in R. sphaeroides. The pL01 vector carries a kanamycin resistance gene, a B. subtilis sacB gene, an oriT sequence, a ColE1 replicon, and a multiple cloning site (Lenz et al., J. Bacteriol., 176(14):4385-93 (1994)). The pL01crtE plasmid was introduced into R. sphaeroides strain 35053 through conjugation with an E. coli donor. The kanamycin resistance gene was used to select for single-crossover events between the truncated crtE gene and the genomic crtE gene that resulted in incorporation of the pL01crtE DNA into the genome. The presence of the sacB gene on the vector allowed for subsequent selection for the loss of the vector DNA from the genome, as expression of this gene in the presence of sucrose is lethal to E. coli and to R. sphaeroides under certain growth conditions. A portion of the double-crossover events that led to loss of the sacB gene contained the truncated crtE allele. This method of gene knockout is useful because no residual antibiotic resistance gene is left in the genome.

[0321] A three-step PCR process was used to create a 249 bp in-frame deletion in the crtE gene. The crtE gene from R. sphaeroides strain 35053 was amplified by PCR using primers designed to introduce an SphI restriction site at the beginning of the amplified fragment and a SacI restriction site at the end of the amplified fragment. The sequences of the primers were as follows. CRTESPHF 5′-CGTGGCATGCGTGTAAGAAAAAGTTGACACCTGTGGAGTC-3′ (SEQ ID NO:172) CRTESACR 5′-CTAAGAGCTCAGTTCGGGCTCGGTCTCGCCTTTCAGGAAG-3′ (SEQ ID NO:173)

[0322] The PCR reaction mix contained 0.2 μM each primer, 1×Genome Advantage reaction buffer, 1 M GCMelt, 1.1 mM Mg(OAc)₂, 0.2 mM each dNTP, 1×Genome Advantage Polymerase, and 1 ng of genomic DNA per μL of reaction mix. The PCR was conducted in a Perkin Elmer Geneamp 2400 and consisted of an initial denaturation at 94° C. for 2 minutes; 32 cycles of a 30 second denaturation at 94° C., a 45 second annealing at 64° C., and a 3 minute extension at 72° C., followed by a final extension at 72° C. for 7 minutes. 200 μL of PCR product was separated on a 1% TAE-agarose gel, and a 1.5 Kb fragment was excised and purified.

[0323] The second round of PCR consisted of two separate reactions: reaction A, which used primers CRTESPHF and CRTERI, and reaction B, which used primers CRTESACR and CRTEFI. The sequences of primers CRTEFI and CRTERI were as follows. CRTEFI 5′-GAGAGCGAGAGCCAGATCAAGAAGSGGCTGAAGGACATCC-3′ (SEQ ID NO:174) CRTERI 5′-GGATGTCCTTCAGCCSCTTCTTGATCTGGCTCTCGCTCTC-3′ (SEQ ID NO:175)

[0324] The 20 nucleotides on the 3′ ends of this pair of primers are located near the center of the crtE gene, 249 bases apart from each other and facing towards the start (CRTERI) and end (CRTEFI) of the gene. The 20 bp on the 5′ ends of these primers are the reverse complement of the 3′ end of the other primer in the pair. PCR of the two separate reactions was conducted as in the first round, with the exception that 0.05 ng of first round product per μL of reaction mix was used as template. Also, the thermocycler program used a 2 minute initial denaturation at 94° C.; eight cycles of a 30 second denaturation at 94° C., a 45 second annealing at 56° C., and a 3 minute extension at 72° C., followed by eight cycles of a 30 second denaturation at 94° C., a 45 second annealing at 60° C., and a 3 minute extension at 72° C.; followed by 16 cycles of a 30 second denaturation at 94° C., a 45 second annealing at 64° C., and a 3 minute extension at 72° C.; followed by a final extension at 72° C. for 7 minutes. Both PCR products, about 590 and 650 bp in length, were separated on a 1% TAE-agarose gel, excised, and gel purified.
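The complementarity between CRTEFI and CRTERI described in paragraph [0324] can be verified directly from the printed sequences. The sketch below is illustrative only; it treats the symbol S in both primers as the IUPAC ambiguity code for G or C, whose complement is again S, which is an interpretation not stated in the text. The function name is hypothetical.

```python
# Complement table; S (G or C under IUPAC codes) is its own complement.
COMPLEMENT = str.maketrans("ACGTS", "TGCAS")

def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence written 5' to 3'."""
    return seq.translate(COMPLEMENT)[::-1]

CRTEFI = "GAGAGCGAGAGCCAGATCAAGAAGSGGCTGAAGGACATCC"  # SEQ ID NO:174
CRTERI = "GGATGTCCTTCAGCCSCTTCTTGATCTGGCTCTCGCTCTC"  # SEQ ID NO:175

# The 20-nt 5' end of each primer against the 20-nt 3' end of the other
five_prime_matches = (
    CRTEFI[:20] == reverse_complement(CRTERI[-20:])
    and CRTERI[:20] == reverse_complement(CRTEFI[-20:])
)
full_match = reverse_complement(CRTEFI) == CRTERI

print("5' ends complement the partner's 3' ends:", five_prime_matches)
print("Primers are full-length reverse complements:", full_match)
```

Both checks pass for the sequences as printed, so the two second-round products share a 40-nucleotide overlap spanning the deletion junction, which is what allows them to anneal to each other in a later round of amplification.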

[0325] The third round of PCR used the same primers and reaction mixture as the first round of PCR with the exception that a mixture of 10 ng of each second round fragment was used as template rather than genomic DNA (200 μL reaction). The PCR program used was also the same as that used in the first round of PCR with the annealing time lengthened to 1.5 minutes. The 1.2 Kb third-round product was separated on a 1% TAE-agarose gel and purified. Three μg of purified DNA was digested with the restriction enzymes SacI and SphI. The digested DNA was cleaned using a QIAquick PCR Purification Kit and digested with the restriction enzyme StuI. StuI cut within the deleted region and ensured that there was little or no remaining full-length product. The digestion mixture was again cleaned using a QIAquick PCR Purification Kit.
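The fragment sizes reported for the three PCR rounds can be cross-checked with simple arithmetic. The sketch below is an approximation under stated assumptions: the gel sizes in the text are rounded estimates, and the 40 bp overlap between the second-round products is inferred from the 40-nucleotide length of primers CRTEFI and CRTERI rather than stated in the text. The function name is hypothetical.

```python
def fused_length(frag_a_bp, frag_b_bp, overlap_bp):
    """Length of a product made by joining two fragments that share an overlap."""
    return frag_a_bp + frag_b_bp - overlap_bp

# Estimate 1: join the two second-round products (about 590 and 650 bp)
from_fragments = fused_length(590, 650, overlap_bp=40)

# Estimate 2: first-round product (about 1.5 Kb) minus the 249 bp deletion
from_deletion = 1500 - 249

print(f"From second-round fragments: about {from_fragments} bp")
print(f"From first-round product less deletion: about {from_deletion} bp")
```

Both estimates fall close to the 1.2 Kb third-round product reported above, within the precision expected of sizes read from an agarose gel.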

[0326] Three μg of the vector pL01 was digested with the restriction enzymes SphI and SacI. The enzymes were inactivated by heating to 65° C. for 20 minutes, and the vector was dephosphorylated using shrimp alkaline phosphatase (Roche). The dephosphorylated vector DNA was gel purified on a 1% TAE-agarose gel.

[0327] Sixty-six ng of digested vector DNA was ligated with 80 ng of the digested third-round PCR product at 16° C. for 16 hours using T4 DNA ligase (Roche). One μL of ligation mix was electroporated into 40 μL of E. coli ElectroMAX™ DH5α™ electrocompetent cells (Life Technologies), which were then plated on LB media containing 50 μg/mL kanamycin (LBK50). Plasmid DNA was isolated from cultures of individual colonies and digested with the restriction enzyme SacI and with a mixture of SphI and SacI to confirm correct vector structure.

[0328] One μL of plasmid DNA was used to transform electrocompetent cells of the previously described E. coli strain S17-1. The electroporated cells were plated on LB media containing 25 μg/mL of kanamycin, 25 μg/mL of streptomycin, and 25 μg/mL of spectinomycin (LBKSMST). Single colonies were used to start cultures for plasmid DNA isolation and used in conjugation. These colonies were also plated on LB media containing 5% sucrose and 25 μg/mL of kanamycin to ensure that the sacB gene was still functional. Only colonies which exhibited lethality on the sucrose media were used in conjugation. The presence of the correct insert size was confirmed by digestion of plasmid DNA with the restriction enzymes SacI and SphI.

[0329] Growing cultures of R. sphaeroides strain 35053 were sub-cultured, using ⅕ and 1/10 volumes of inoculum, in 5 mL Sistrom's media supplemented with 20% LB and grown at 30° C. for 12 hours. The S17-1 donor colonies were grown in LBKSMST media at 37° C. for 12 hours. 1.5-3.0 mL of each culture was pelleted, and the pellets were washed four times with LB media. Relative pellet size was estimated and about 2 volumes of 35053 cells were used to 1 volume of S17-1 cells. The cell mixture was pelleted, resuspended in 20 μL of LB media, spotted on an LB plate, and incubated at 30° C. for 7-15 hours. The cells were then scraped off the surface of the plate and resuspended in 1.5 mL of Sistrom's salts. 200 μL of resuspended cells were plated on each of seven plates of SisKTell media.

[0330] Colonies that grew on the plates after about 10 days, representing proposed single-crossover events, were streaked to new plates of the same media. Upon growth, single colonies were streaked out on LBK25 media. Purified colonies were patched to Sistrom's media supplemented with 1×LB, 15% sucrose, 0.5% DMSO (v/v), and 25 μg/mL kanamycin (SisLBK15%SucDMSO). These were grown in an anaerobic chamber (Becton Dickinson, Sparks, Md.) at 30° C. for 5 days to check for lethality of the sacB gene in the proposed single-crossover events. Concurrently, the cultures were patched to SisLB media containing 15% sucrose and 0.5% DMSO (v/v) without kanamycin (SisLB15%SucDMSO). Several of the cultures exhibited both white and red colonies upon growth on this media. Whitish-gray colonies were purified from these cultures and tested by PCR to show that they contained the truncated crtE allele. These colonies were also screened using primers specific to the R. sphaeroides ppsR gene and the E. coli dxs gene as described above. Potential double crossovers were also streaked on LBK25 plates to confirm that they were now sensitive to kanamycin. The resulting R. sphaeroides mutant containing a crtE knockout was designated ATCC 35053/ΔcrtE.

[0331] Several discoveries were made using the sacB method to knock out nucleic acid sequences within the R. sphaeroides genome. First, it was discovered that the cultures used in conjugations, particularly those of the recipient R. sphaeroides strain, should be in exponential growth. Second, it was discovered that when using the S17-1 strain as a vector donor, the use of telluride in the plating medium is unnecessary as this strain is a proline auxotroph and will not grow on Sistrom's media without LB supplementation. Third, it was discovered that potential single crossovers should be screened using two separate PCR reactions. The first reaction should use a primer within the gene of interest together with a primer homologous to upstream sequence. The second reaction should use a primer within the gene of interest together with a primer homologous to downstream sequence. One of these two reactions should produce a truncated fragment. Fourth, it was discovered that single crossovers that have been confirmed to have sacB lethality can be grown aerobically in Sistrom's media for 2 days and then plated on SisLB15%SucDMSO media. The volume plated varies depending on the rate of growth of the strain, but is about one μL or less for strain 35053. The plated culture is then grown anaerobically for about 5 days. Fifth, it was discovered that the sacB gene may not completely kill cells with the gene, so there may be a background level of very small colonies. The desired double-crossover colonies, however, are typically larger. These colonies should be purified and screened by PCR to identify whether they contain the truncated or full-length allele. Sixth, it was discovered that using one primer homologous to sequence upstream of the knockout gene and one primer homologous to sequence downstream of the gene is useful in confirming the correct location of the insertion event in addition to determining the allele that is present.

[0332] ATCC 35053/ΔppsR(strep)

[0333] R. sphaeroides cells lacking ppsR were made by inserting a spectinomycin/streptomycin resistance gene into the ppsR sequence as follows. To PCR amplify the ppsR gene from R. sphaeroides strain 17023, the following primers were designed based on published sequence (GenBank Accession Number L19596). PPSRF2 5′-AGTCAGTACTAACTGGTGAAGACGCTGAAG-3′ (SEQ ID NO:176) PPSRR2 5′-GATCAGTACTGTGAACGAATACGATACGCA-3′ (SEQ ID NO:177)

[0334] Each primer contained a ScaI restriction site. The ppsR gene was amplified using the following reaction mix and PCR amplification program.
Reaction Mix
pfu 10X buffer: 10 μL
DMSO: 5 μL
dNTP mix (10 mM): 8 μL
PPSRF2 (50 μM): 2 μL
PPSRR2 (50 μM): 2 μL
Genomic DNA (50 ng/μL): 2 μL
pfu enzyme (2.5 U/μL): 2 μL
DI water: 69 μL
Total: 100 μL
Program
94° C., 5 minutes
8 cycles of: 94° C., 45 seconds; 54° C., 45 seconds; 72° C., 3 minutes
25 cycles of: 94° C., 45 seconds; 61° C., 45 seconds; 72° C., 3 minutes
72° C., 10 minutes
4° C., until used further
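The reaction mix volumes above can be checked for consistency and converted to final concentrations. The following is a minimal sketch in Python; the dictionary layout and function name are illustrative assumptions, and only the volumes and stock concentrations are taken from the text.

```python
# Volumes (in microliters) of each component of the ppsR reaction mix.
MIX_UL = {
    "pfu 10X buffer": 10,
    "DMSO": 5,
    "dNTP mix": 8,
    "PPSRF2": 2,
    "PPSRR2": 2,
    "genomic DNA": 2,
    "pfu enzyme": 2,
    "DI water": 69,
}

# Stock concentrations as (value, unit) for components that list one.
STOCKS = {
    "dNTP mix": (10, "mM"),
    "PPSRF2": (50, "uM"),
    "PPSRR2": (50, "uM"),
    "genomic DNA": (50, "ng/uL"),
    "pfu enzyme": (2.5, "U/uL"),
}

TOTAL_UL = sum(MIX_UL.values())


def final_concentration(component):
    """Final concentration after dilution into the full reaction,
    in the same unit as the stock (C1 * V1 / V_total)."""
    stock_value, unit = STOCKS[component]
    return stock_value * MIX_UL[component] / TOTAL_UL, unit


if __name__ == "__main__":
    print("total volume (uL):", TOTAL_UL)
    for name in STOCKS:
        value, unit = final_concentration(name)
        print(f"{name}: {value:g} {unit}")
```

With the listed volumes, the components sum to the stated 100 μL total, each primer is at 1 μM, the dNTP mix is at 0.8 mM, and DMSO is at 5% (v/v).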

[0335] The PCR product was separated on a 0.8% TAE agarose gel, and a band of about 1.8 Kb was cut and gel isolated using a Qiagen Gel Isolation kit (Qiagen, Valencia, Calif.). The gel isolated DNA was digested with ScaI (New England BioLabs, Beverly, Mass.) for 5 hours. The digested DNA was column purified using a Qiagen Gel Isolation kit. The cut DNA was ligated into vector pSUP203 that was also digested with ScaI enzyme.

[0336] 2.3 μg of pSUP203 plasmid DNA was digested for 4 hours at 37° C. with ScaI restriction enzyme. The digested DNA was separated on a 1% TAE agarose gel. A 7.6 Kb fragment was excised and purified. The purified plasmid DNA was then dephosphorylated using calf intestinal phosphatase (New England Biolabs). 100 ng of dephosphorylated plasmid DNA was ligated with 200 ng of the ScaI-digested PpsR DNA for 16 hours at 14° C. using T4 DNA ligase (New England BioLabs). One μL of ligation mix was electroporated into 40 μL of E. coli ElectroMAX™ DH5α™ (Life Technologies, Gaithersburg, Md.) electrocompetent cells, which were then recovered in 1 mL of SOC media for one hour at 37° C. and plated on LB media containing 15 μg/mL tetracycline. Plasmid DNA was isolated from 8 individual colonies using a Qiagen spin Mini prep kit and digested with ScaI restriction enzyme to check insert size. Four of the colonies had a correct insert. 1.5 μg of the plasmid DNA obtained from a confirmed colony was digested with XhoI restriction enzyme (New England BioLabs, Beverly, Mass.). This enzyme has a single restriction site in the open reading frame of the ppsR gene. A linear DNA band of about 8.4 Kb was gel isolated using a Qiagen Gel isolation kit. A spectinomycin/streptomycin resistance omega cassette was obtained by digesting plasmid pUI1638 (obtained from Dr. Samuel Kaplan's laboratory) with XhoI enzyme. The digest was separated on a 0.8% TAE agarose gel, and a DNA band of about 2.1 Kb was gel isolated. This DNA, which encoded the spectinomycin/streptomycin resistance gene, was ligated to pSUP203/PpsR, which was also restricted with XhoI enzyme. One μL of ligation mix was electroporated into 40 μL of E. coli ElectroMAX™ DH5α™ (Life Technologies, Gaithersburg, Md.) electrocompetent cells, which were then recovered in 1 mL of SOC media for one hour at 37° C. and plated on LB media with 15 μg/mL tetracycline, 25 μg/mL spectinomycin, and 25 μg/mL streptomycin. 
Plasmid DNA was isolated from 10 individual colonies using a Qiagen spin Mini prep kit and digested separately with ScaI and XhoI restriction enzymes to check insert size. Five of the colonies had a correct insert. 100 ng of plasmid DNA from a confirmed colony was electroporated into electrocompetent cells of the E. coli strain SM10. This strain contains a chromosomal copy of the trans-acting elements that mobilize oriT-containing plasmids during conjugation with a second bacterial strain. It also carries a gene conferring resistance to the antibiotic kanamycin. The transformation reaction was recovered in 1 mL of SOC media for one hour and plated on LB media with 10 μg/mL tetracycline, 25 μg/mL kanamycin, 25 μg/mL of streptomycin, and 25 μg/mL spectinomycin.
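The insert-to-vector molar ratio of the first ligation described above (100 ng of the 7.6 Kb vector fragment with 200 ng of the 1.8 Kb PCR product) can be estimated from the masses and fragment lengths. The sketch below assumes that the mass of double-stranded DNA is proportional to its length; the function name is illustrative.

```python
def insert_to_vector_molar_ratio(insert_ng, insert_kb, vector_ng, vector_kb):
    """Molar ratio of insert to vector, assuming DNA mass is
    proportional to fragment length (moles ~ mass / length)."""
    insert_moles = insert_ng / insert_kb
    vector_moles = vector_ng / vector_kb
    return insert_moles / vector_moles


# 200 ng of ~1.8 Kb insert ligated with 100 ng of 7.6 Kb vector.
ratio = insert_to_vector_molar_ratio(200, 1.8, 100, 7.6)

if __name__ == "__main__":
    print(f"insert:vector molar ratio is about {ratio:.1f}:1")
```

Under this assumption the ligation used roughly an 8:1 molar excess of insert over vector.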

[0337] The pSUP203/ppsR-SM-ST construct was conjugated from the E. coli SM10 host into R. sphaeroides strain 35053. The SM10 donor was grown in LB media with 25 μg/mL kanamycin, 25 μg/mL streptomycin, and 25 μg/mL spectinomycin at 37° C. for 16 hours. A growing culture of R. sphaeroides strain 35053 was used to inoculate Sistrom's media in ⅕ to 1/10 dilutions. These cultures were grown for about 20 hours. Cells were pelleted from 1.5 mL of culture of both the SM10pSUP203/PpsR-SM-ST and 35053 genotypes. Pellets were washed four times in Sistrom's media without vitamins and glucose. The pellets were each resuspended in 1.5 mL of Sistrom's media without vitamins and glucose. 200 μL of the SM10pSUP203/PpsR-SM-ST cells were combined with 1.3 mL of the 35053 cells. This mixture was pelleted, the supernatant was removed, and the pellet was resuspended in 20 μL of LB media. The resuspended cells were spotted onto an LB plate that was then incubated at 30° C. for 7 hours. The cells were then scraped off the LB plate, resuspended in 1.5 mL of 1×Sistrom's media without vitamins and glucose, and plated (200 μL/plate) on Sistrom's media supplemented with 25 μg/mL spectinomycin, 25 μg/mL streptomycin, and 10 μg/mL of telluride. The telluride retards the growth of E. coli cells but is detoxified by R. sphaeroides. After 7-10 days, small black colonies were picked off the plates and streaked to fresh plates of the same media. After 6 days of growth, colonies were patched to LB plates containing 25 μg/mL spectinomycin and 25 μg/mL streptomycin (LBSMST25), and also to LB plates containing 0.75 μg/mL tetracycline. Desirable double-crossover events, in which the PpsR-SM-ST gene is retained in the genome and the vector DNA is lost, would have spectinomycin/streptomycin resistance but lack tetracycline resistance. Colonies resulting from undesirable single-crossover events would demonstrate resistance to all of these antibiotic markers.
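The antibiotic patch test described above amounts to a simple decision rule. The following is a minimal sketch in Python; the function name and the label used for colonies lacking spectinomycin/streptomycin resistance are assumptions added for illustration.

```python
def classify_exconjugant(sm_sp_resistant, tet_resistant):
    """Classify a patched colony from its growth on LBSMST25
    (spectinomycin/streptomycin) and on tetracycline plates."""
    if sm_sp_resistant and not tet_resistant:
        # Cassette retained in the genome, vector backbone lost.
        return "double crossover"
    if sm_sp_resistant and tet_resistant:
        # Whole vector, including its tetracycline marker, integrated.
        return "single crossover"
    # Not described in the text; labeled here as an assumption.
    return "no cassette detected"


if __name__ == "__main__":
    print(classify_exconjugant(True, False))
    print(classify_exconjugant(True, True))
```

Colonies classified as double crossovers by this rule were then confirmed by Southern hybridization, as described below.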

[0338] Colonies that exhibited only spectinomycin/streptomycin resistance and displayed deep red color were confirmed for double-crossover by Southern hybridization. Southern hybridization was conducted on nineteen potential 35053/PpsR-SM-ST colonies in addition to 35053 and R. sphaeroides strain 17023. Sequence data for the photosynthetic operon of 17023 is available in GenBank and was used to determine restriction enzymes likely to have hybridization patterns that would distinguish mutants from non-mutants. Genomic DNA was isolated from each line using a Gentra Puregene DNA Isolation Kit (Gentra, Minneapolis, Minn.). 2 μg of genomic DNA was used in digests using the restriction enzymes NcoI, ApaI, and XmaI in separate reactions. The digests were separated on a 1% TAE agarose gel, and the DNA was transferred to nylon membrane (Roche Molecular Biochemicals, Indianapolis, Ind.). DIG-labeled molecular weight markers II and III (Roche) were also included on the gel/membrane. DIG-labeled probes of the PpsR locus were made using a PCR DIG Probe Synthesis Kit (Roche). After baking, membranes were prehybridized in EasyHyb Buffer (Roche) for at least 2 hours and hybridized overnight using 400 nL of a 0.5 DIG labeling per mL of hybridization solution. Detection was done using a Roche Wash and Block Buffer Set (Roche). Membranes were washed two times for 5-10 minutes at room temperature in 2×SSC/0.1% SDS and two times for 15-20 minutes at 68° C. in 0.1×SSC/0.1% SDS. They were then covered with blocking buffer and placed on a shaker for an hour at room temperature. The blocking buffer was replaced with fresh blocking buffer containing 150 mU of AP conjugate per mL of buffer, and the membranes shaken at room temperature for an additional 30 minutes. Membranes were then washed twice for 15 minutes at room temperature with washing buffer, followed by a five minutes wash with detection buffer. 
The detection buffer was replaced with fresh detection buffer containing 20 μL of NBT/BCIP solution per mL of buffer. This was placed in the dark at room temperature with no shaking until sufficient color was developed.

[0339] In the NcoI digest, the lanes of colonies 9 and 10 exhibited a band about 2 Kb larger than the 35053 control, which is the size difference expected from the insertion of the spectinomycin/streptomycin resistance cassette into the XhoI site. For the XmaI digest, 35053 exhibited a single band of about 5.5 Kb, while colonies 9, 10, and 5 exhibited two bands whose summed size was about 2 Kb higher than that of 35053. Two bands were observed in colonies 9, 10, and 5 because an XmaI site was introduced along with the spectinomycin/streptomycin resistance cassette. For the ApaI digest, the control 35053 sample exhibited two bands since the ppsR gene harbors an ApaI site. Each of these bands was about 2.3 Kb in size. Colonies 9, 10, and 5 exhibited three bands, whose summed size was about 2 Kb higher than that of 35053. An extra band was observed in colonies 9, 10, and 5 because an ApaI site was introduced along with the spectinomycin/streptomycin resistance cassette.

[0340] The resulting R. sphaeroides mutant containing the ppsR knockout was designated ATCC 35053/ΔppsR(strep).

[0341] ATCC 35053/ΔppsR

[0342] R. sphaeroides cells lacking ppsR were made using sacB selection as follows. A three-step PCR process was used to create a 255 bp in-frame deletion in the PpsR gene, so that there would be no residual antibiotic resistance gene in the genome. The PpsR gene from R. sphaeroides strain 35053 was amplified by PCR using primers designed to introduce a SacI restriction site at the beginning of the amplified fragment and a SphI restriction site at the end of the amplified fragment. The sequences of the primers were as follows. PPSRSACF2 5′-GTCAAATGAGCTCCAAACTGGTGAAGACGCTGAAGGACAT-3′ (SEQ ID NO:178) PPSRSPHR 5′-CAGTCGGGCATGCGTCCATTTCAGTTGACATACTTCTGTG-3′ (SEQ ID NO:179)

[0343] The following PCR mix and program were used to amplify the PpsR gene.
Reaction Mix
pfu 10X buffer: 10 μL
DMSO: 5 μL
dNTP mix (10 mM): 3 μL
PPSRSACF2 (100 μM): 1 μL
PPSRSPHR (100 μM): 1 μL
Genomic DNA (50 ng/μL): 2 μL
pfu enzyme (2.5 U/μL): 2 μL
DI water: 76 μL
Total: 100 μL
Program
94° C., 2 minutes
8 cycles of: 94° C., 30 seconds; 58° C., 45 seconds; 72° C., 3 minutes
25 cycles of: 94° C., 30 seconds; 64° C., 45 seconds; 72° C., 3 minutes
72° C., 7 minutes
4° C., until used further

[0344] 100 μL of PCR product was separated on a 1% TAE agarose gel, and a fragment about 1.8 Kb was excised and purified using Qiagen Gel isolation kit.

[0345] The second round of PCR consisted of two separate reactions: reaction A, which used primers PPSRSACF2 and PPSRMIDR, and reaction B, which used primers PPSRSPHR and PPSRMIDF. The sequences of primers PPSRMIDF and PPSRMIDR were as follows. PPSRMIDF 5′-CTCTTGCTCGGCGGCGTGCGGCTCTATCACGAGGGGGTGGA-3′ (SEQ ID NO:180) PPSRMIDR 5′-TCCACCCCCTCGTGATAGAGCCGCACGCCGCCGAGCAAGAG-3′ (SEQ ID NO:181)

[0346] The 20 nucleotides on the 3′ ends of this pair of primers are located near the center of the ppsR gene, 255 bases apart from each other, and facing towards the start (PPSRMIDR) and end (PPSRMIDF) of the gene. The 20 bp on the 5′ ends of these primers are the reverse complement of the 3′ end of the other primer in the pair. The following reaction mixes and program were used to conduct these PCR reactions.
Reaction Mix A
pfu 10X buffer: 10 μL
DMSO: 5 μL
dNTP mix (10 mM): 3 μL
PPSRSACF2 (100 μM): 1 μL
PPSRMIDR (100 μM): 1 μL
DNA from first round (10 ng/μL): 1 μL
pfu enzyme (2.5 U/μL): 2 μL
DI water: 77 μL
Total: 100 μL
Reaction Mix B
pfu 10X buffer: 10 μL
DMSO: 5 μL
dNTP mix (10 mM): 2 μL
PPSRSPHR (100 μM): 1 μL
PPSRMIDF (100 μM): 1 μL
DNA from first round (5 ng/μL): 1 μL
pfu enzyme (2.5 U/μL): 2 μL
DI water: 78 μL
Total: 100 μL
Program (both reactions)
94° C., 2 minutes
8 cycles of: 94° C., 30 seconds; 58° C., 45 seconds; 72° C., 3 minutes
25 cycles of: 94° C., 30 seconds; 64° C., 45 seconds; 72° C., 3 minutes
72° C., 7 minutes
4° C., until further use

[0347] Both PCR products, about 800-700 bp in length, were separated on a 1% TAE agarose gel, excised, and gel purified using a Qiagen gel isolation kit.

[0348] The third round of PCR used primers PPSRSACF2 and PPSRSPHR but used both fragments derived in the second round of PCR as template. The PCR mixture used was the same as in the first round of PCR except that equal molar amounts of the round 2 fragments were used as template. The PCR program used was also the same as that used in the first round of PCR, with the annealing time lengthened to 1.5 minutes. The 1.5 Kb third-round product was separated on a 1% TAE agarose gel and purified using Qiagen gel isolation kit. The purified DNA was digested overnight at 37° C. with the restriction enzymes SacI and SphI.

[0349] Three μg of the vector pL01 was digested with the restriction enzymes SphI and SacI at 37° C. for 16 hours. The enzymes were inactivated by heating to 65° C. for 20 minutes. Dephosphorylation of the vector was achieved by adding 4.7 μL of shrimp alkaline phosphatase 10×buffer (Roche) and 2 μL of shrimp alkaline phosphatase to the inactivated digest. This mixture was heated at 37° C. for 10 minutes and then 65° C. for 15 minutes. The dephosphorylated vector DNA was then gel purified on a 1.0% TAE agarose gel.

[0350] 98 ng of vector DNA was ligated with 210 ng of the digested third-round PCR product at 14° C. for 14 hours using T4 DNA ligase (Roche). One μL of ligation mix was electroporated into 40 μL of E. coli ElectroMAX™ DH5α™ electrocompetent cells (Life Technologies), which were then recovered in 1 mL of SOC media for one hour and plated on LB media with 25 μg/mL kanamycin (LBK25). Plasmid DNA was isolated from eight individual colonies. Plasmid DNA was checked for correct insert with a PCR screen using the PCR protocol from the first round.

[0351] One μL of plasmid DNA was used to transform electrocompetent cells of E. coli strain S17-1. The electroporated cells were recovered in 1 mL of SOC media for one hour and plated on LB media with 25 μg/mL of kanamycin, 25 μg/mL of streptomycin, and 25 μg/mL of spectinomycin (LBKSMST). Single colonies were used to start cultures for plasmid DNA isolation and used in conjugation. These colonies were also plated on LB media containing 5% or 15% sucrose, and 25 μg/mL of kanamycin to ensure that the sacB gene was still functional. Only colonies that showed lethality on the sucrose media were used in conjugation. The presence of the correct insert size was confirmed by colony PCR.

[0352] Growing cultures of R. sphaeroides strain 35053 were subcultured, using ¼ and ⅛ volumes of inoculum, in 5 mL Sistrom's media supplemented with 20% LB and grown at 30° C. for 9 hours. The S17-1 donor colonies were grown in LBKSMST media at 37° C. for 16 hours. 3.0 mL of 35053 and 0.5 mL of S17-1 donor cells were centrifuged and washed four times in Sistrom's media without glucose. Each cell pellet was resuspended into 20 μL LB, and the S17-1 donor suspension was mixed with 35053. The mixture was then spotted on LB, which was incubated at 30° C. for 14-16 hours. The cells were then scraped off the surface of the plate and resuspended in 1.5 mL of Sistrom's salts. 200 μL of resuspended cells were plated on each of the seven Sistrom's media plates that were supplemented with 25 μg/mL of kanamycin.

[0353] Colonies that grew on the plates after about 10-14 days, representing proposed single crossover events, were streaked to new plates of the same media. Upon growth, single colonies were transferred to LBK25 media. These cultures were grown for 36 to 48 hours in Sistrom's media supplemented with 20% LB and no kanamycin at 30° C. 0.1 μL and 5 μL of this culture were plated on LB media that was supplemented with Sistrom's salts and 15% sucrose. The plates were placed in an anaerobic chamber (Becton Dickinson, Sparks, Md.), and the chamber was placed in a 30° C. incubator. After 4-5 days, several colonies showed up on the plates, indicating the occurrence of double-crossover events. Four colonies from each single-crossover strain were purified by streaking on LB agar plates. Single colonies of double-crossover strains were screened by PCR for integration of the truncated version of the ppsR gene into the chromosome. For screening, the following primers were used, which were located upstream and downstream of the PpsR gene. The use of upstream and downstream primers confirms both the locus of integration and the truncation of the PpsR gene. PPSRUPF 5′-GAGCAGCACACTCTGGGAGC-3′ (SEQ ID NO:182) PPSRDNR 5′-CCACACAGGTAGGACACCCAC-3′ (SEQ ID NO:183)

[0354] The following reaction mix and PCR program were used.
Reaction Mix
Taq Mg + 10X buffer: 2.5 μL
DMSO: 1.25 μL
dNTP mix (10 mM): 0.5 μL
PPSRUPF (100 μM): 0.125 μL
PPSRDNR (100 μM): 0.125 μL
Cell boil mix: 2 μL
Taq enzyme (5 U/μL): 0.2 μL
DI water: 18.3 μL
Total: 25 μL
Program
94° C., 2 minutes
29 cycles of: 94° C., 30 seconds; 61° C., 45 seconds; 72° C., 3 minutes
72° C., 7 minutes
4° C., until further use
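When many colonies are screened at once, the 25 μL recipe above can be scaled into a master mix to which each cell boil template is added separately. The sketch below is one hedged way to do that arithmetic; the 10% overage and the function name are assumptions for illustration and are not stated in the text.

```python
# Per-reaction volumes (microliters) of every component except the
# 2 uL cell boil template, which is added to each tube individually.
PER_REACTION_UL = {
    "Taq Mg + 10X buffer": 2.5,
    "DMSO": 1.25,
    "dNTP mix (10 mM)": 0.5,
    "PPSRUPF (100 uM)": 0.125,
    "PPSRDNR (100 uM)": 0.125,
    "Taq enzyme (5 U/uL)": 0.2,
    "DI water": 18.3,
}


def master_mix(n_reactions, overage=0.10):
    """Scale the per-reaction volumes for n reactions plus a
    fractional overage (assumed here) to cover pipetting loss."""
    factor = n_reactions * (1 + overage)
    return {name: vol * factor for name, vol in PER_REACTION_UL.items()}


if __name__ == "__main__":
    for name, vol in master_mix(10).items():
        print(f"{name}: {vol:.2f} uL")
```

Each tube would then receive 23 μL of master mix and 2 μL of cell boil mix to reach the 25 μL total.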

[0355] The cell boil mix was prepared by resuspending a single colony in 20-25 μL of water. The suspension was heated at 95° C. for 10 minutes in a PCR machine. The tube was given a quick spin to pellet the solids.

[0356] The colonies that exhibited the truncated version of the PpsR gene were further tested for kanamycin sensitivity by streaking them on LB plates that were supplemented with 25 μg/mL of kanamycin. Also, these colonies were PCR screened for the kanamycin resistance gene.

[0357] The resulting R. sphaeroides mutant containing the ppsR knockout was designated ATCC 35053/ΔppsR.

[0358] ATCC 35053/ΔccoN

[0359] R. sphaeroides cells lacking ccoN were made using sacB selection as follows. A mutant of R. sphaeroides strain 2.4.1 having a 546 bp deletion in the ccoN gene (R. sphaeroides 2.4.1/ΔccoN) was obtained from the laboratory of Samuel Kaplan at the University of Texas (Oh and Kaplan, Biochemistry, 38:2688-2696 (1999)). The mutated ccoN locus of this strain was amplified by PCR and cloned into pL01. This plasmid was transformed into E. coli strain S17-1. The S17-1 strain was conjugated with R. sphaeroides strain 35053, and colonies were identified in which the truncated locus had replaced the native ccoN gene.

[0360] The truncated ccoN gene from R. sphaeroides 2.4.1/ΔccoN was amplified by PCR using primers designed to introduce a SacI restriction site at the beginning of the amplified fragment and a SphI restriction site at the end of the amplified fragment. The sequences of the primers were as follows. CCONSACF 5′-TCAGAGCTCGTGTGATCGAATGGGGCTTTGTTCCTTGATG-3′ (SEQ ID NO:184) CCONSPHR 5′-GAAGCATGCAGGTGATCGACGTGCCACTCGTCCGAATAG-3′ (SEQ ID NO:185)

[0361] The PCR reaction mix contained 0.2 μM each primer, 1×Native Pfu reaction buffer, 0.2 mM each dNTP, 5% DMSO, and 10 units of Pfu DNA polymerase in a 200 μL reaction. Three μL of the glycerol stock was diluted in 20 μL of 10 mM Tris and heated at 94° C. for 10 minutes, after which 4 μL was added to the PCR reaction. The PCR was conducted in an MJ Research PT100 and consisted of an initial denaturation at 94° C. for 2 minutes; 32 cycles of a 30 second denaturation at 94° C., a 1 minute annealing at 66° C., and a 4 minute extension at 72° C.; followed by a final extension at 72° C. for 7 minutes. The PCR product was separated on a 1% TAE-agarose gel, and a 1.6 Kb fragment was excised and purified. Three μg of purified PCR product was digested with SacI restriction enzyme and separated on a 1% TAE gel. A 1.4 Kb band was excised and purified. A SacI restriction site exists about 200 bp from the CCONSPHR end of the original PCR product.

[0362] Three μg of the vector pL01 was digested with the restriction enzyme SacI. The enzyme was inactivated by heating to 65° C. for 20 minutes, and the digested vector was dephosphorylated using shrimp alkaline phosphatase. The dephosphorylated vector DNA was gel purified on a 1% TAE-agarose gel.

[0363] 50 ng of digested vector DNA was ligated with 65 ng of the digested ccoN PCR product at 16° C. for 16 hours using T4 DNA ligase (Roche). One μL of ligation mix was electroporated into 40 μL of E. coli Electromax™ DH5α™ electrocompetent cells, which were then plated on LBK media. Plasmid DNA was isolated from cultures of individual colonies and digested with the restriction enzyme SacI to confirm correct insert size.

[0364] The E. coli strain S17-1 contains a chromosomal copy of the trans-acting elements that mobilize oriT-containing plasmids during conjugation with a second bacterial strain. It also carries genes conferring resistance to the antibiotics streptomycin and spectinomycin. In addition, S17-1 is a proline auxotroph and will not grow on unsupplemented Sistrom's media. One μL of DNA of the truncated ccoN construct was used to transform electrocompetent cells of E. coli strain S17-1. The electroporation was plated on LBKSMST. Single colonies were used to start cultures for plasmid DNA isolation and used in conjugation. These colonies were also plated on LB media containing 5% sucrose and 25 μg/mL of kanamycin to ensure that the sacB gene was still functional. Only colonies that exhibited lethality on the sucrose media were used in conjugation. The presence of the correct insert size was confirmed by digestion of plasmid DNA with the restriction enzyme SacI.

[0365] Growing cultures of R. sphaeroides strain 35053 were subcultured in Sistrom's media supplemented with 20% LB to ensure that they were in exponential growth. The S17-1 donor colonies were grown in LBKSMST media at 37° C. overnight or subcultured from growing colonies. 2-4 mL of each culture was centrifuged, and the pellets were washed four times in LB media. Relative pellet size was estimated, and about 2 volumes of 35053 cells were used to 1 volume of S17-1 cells. The cell mixture was then pelleted, resuspended in 20 μL of LB media, and spotted on an LB plate. This plate was incubated at 30° C. for 7-15 hours. The cells were then scraped off the surface of the plate and resuspended in 1.2 mL of Sistrom's salts. 200 μL of resuspended cells were plated on each of six plates of Sistrom's media containing 25 μg/mL of kanamycin (SisK).

[0366] Colonies that grew on the plates after about 10 days, representing potential single-crossover events, were streaked to new plates of SisK media. Upon growth, single colonies were transferred to LBK media. Purified colonies were streaked to Sistrom's media supplemented with 1×LB, 15% sucrose, 0.5% DMSO (v/v), and 25 μg/mL kanamycin (SisLBK15%SucDMSO). These were grown in an anaerobic chamber (Becton Dickinson, Sparks, Md.) at 30° C. for 5 days to check for lethality of the sacB gene in the single-crossover events. The purified colonies were also screened in two separate PCR reactions. The first reaction used a primer within the gene of interest (CCONR) together with a primer homologous to upstream sequence (CCONUPF2), and the second reaction used a primer within the gene of interest (CCONSACF) together with a primer homologous to downstream sequence (CCONDNR2). Single-crossover events exhibited a truncated fragment in one of the two reactions, depending on whether the crossover occurred upstream or downstream of the deletion. The primer sequences were as follows. CCONUPF2 5′-CTCACAACCTCCAACCGATG-3′ (SEQ ID NO:186) CCONR 5′-CGATGGTGACCACGAAGAAG-3′ (SEQ ID NO:94) CCONDNR2 5′-CGTAACGCTCGGTCTCGTC-3′ (SEQ ID NO:129)

[0367] Single-crossover colonies were grown in Sistrom's media supplemented with 20% LB. After 2 days of growth, 0.1-1 μL of the cultures was plated on Sistrom's media supplemented with 1×LB, 0.5% DMSO (v/v), and 15% sucrose (SisLB15%SucDMSO). These cultures were grown anaerobically for about 5 days. The sacB gene did not always completely kill cells with the gene, so there was often a background level of very small colonies. The larger colonies, which represented double-crossover events, were purified on LB media and screened by PCR to identify whether they contained the truncated or full-length allele. The CCONUPF2 and CCONDNR2 primers were used in this PCR screen to ensure that the truncated gene also was inserted in the correct location in the genome. Potential double-crossovers were also streaked on LBK plates to confirm that they were now sensitive to kanamycin.

[0368] The resulting R. sphaeroides mutant containing the ccoN knockout was designated ATCC 35053/ΔccoN.

[0369] ATCC 35053/ΔcrtE/ΔccoN

[0370] R. sphaeroides cells lacking crtE and ccoN were made as follows. The wildtype ccoN allele of a crtE knockout mutant (ATCC 35053/ΔcrtE) was replaced with a truncated ccoN allele as described above. Double-crossover colonies having the truncated ccoN allele were then re-screened by PCR for the crtE and ccoN loci. These colonies were plated on LBK25 and screened by PCR to confirm the loss of the vector from the genome. The resulting R. sphaeroides mutant containing the crtE knockout and ccoN knockout was designated ATCC 35053/ΔcrtE/ΔccoN.

[0371] ATCC 35053/ΔcrtE/ΔppsR/ΔccoN

[0372] R. sphaeroides cells lacking crtE, ppsR, and ccoN were made as follows. The wildtype ppsR allele of a crtE/ccoN knockout mutant (ATCC 35053/ΔcrtE/ΔccoN) was replaced with a truncated ppsR allele as described above with the following exceptions. After conjugation on an LB plate, the conjugated cells were plated on Sistrom's media containing 25 μg/mL of kanamycin and 0.5% DMSO (SisKDMSO) rather than on SisK. After purification on SisKDMSO and LBKDMSO, single-crossovers were grown aerobically in Sistrom's media supplemented with 1×LB and 0.5% DMSO. After 2 days of growth, the cultures were plated on Sistrom's media supplemented with 1×LB, 15% sucrose, and 0.5% DMSO, and grown anaerobically for 5 days. Potential double-crossover colonies were purified on LBDMSO and screened by PCR using the PPSRUPF and PPSRDNR primers. Colonies having the truncated ppsR allele were then rescreened by PCR for the crtE, ppsR, and ccoN loci. These colonies were also plated on LBKDMSO and screened by PCR to confirm the loss of the vector from the genome. The resulting R. sphaeroides mutant containing the crtE knockout, ppsR knockout, and ccoN knockout was designated ATCC 35053/ΔcrtE/ΔppsR/ΔccoN.

Example 10 Making Recombinant Microorganisms that Overexpress a Particular Sequence While Containing a Knock-Out

[0373] Any construct developed for the overexpression of genes is transferred to any of the background genotypes developed by gene knockout techniques. For example, the pMCS2tetP/Stdxs/Rsdds/EcUbiC or the pMCS2tetP/Stdxs/Rsdds/RsLytB construct is transferred into the R. sphaeroides ATCC 35053/ΔcrtE/ΔppsR/ΔccoN mutant cells to combine the productive effects of gene overexpression and engineering of gene regulation or carbon flow. The construct is transferred to the desired genotype by electroporation or conjugation. Conjugation of a plasmid into an R. sphaeroides strain follows the procedure described for the isolation of single-crossover events except that, since the efficiency of plasmid transfer is much higher than that of chromosomal integration, a 0.1-1 μL plating volume from the ~400 μL conjugation recovery is ample to obtain transformed colonies. Single colony PCR is used to check the integrity of the construct in the new background, and evaluations of the productivity of the new strain are made. Genes that are productive are integrated, in one or more copies, into appropriate regions of the chromosome of a productive strain along with or downstream of a highly-expressing promoter.

Example 11 Three Liter Fermentations

[0374] Cultures of R. sphaeroides ATCC 35053 with various inserted genes or knockouts were grown in 5 mL culture tubes containing Sistrom's media with 4 g/L glucose. After 48 hours of growth at 30° C. with 250 rpm shaking, the entire contents of the tube were used to inoculate a 300 mL baffled shake flask containing Sistrom's media with 4 g/L glucose. After incubation at 30° C. for 48 hours, the entire contents of the flask were added to 2.7 L of Sistrom's media containing 40 g/L glucose in a B. Braun Biotech International Model Biostat B fermenter.

[0375] The fermenter was maintained at 30° C., and the cascade was set to maintain the dissolved oxygen (DO) at 40%. The air inflow was maintained at 1 vvm, and the pH was maintained at 7.3 with an automatic feed of 2N NH₄OH. Foaming was controlled by addition of Sigma Antifoam 289. Kanamycin to a concentration of 50 μg/mL was added to fermentations with strains containing the broad host range vector pBBR1MCS2 either with or without an inserted gene. At 24 to 30 hours, when the agitation increase to maintain a DO of 40% had leveled off, the agitation and DO were decoupled, and the agitation was fixed at 240 rpm. The air inflow was lowered to 0.3 vvm. Kanamycin to 50 μg/mL was again added to fermentations containing the expression vector.

[0376] The fermentation samples for coenzyme Q10 and spheroidenone analysis were removed at 69 to 75 hours into the fermentation.

[0377] Example 12 Three-Hundred Milliliter Fermentations

[0378] Cultures of R. sphaeroides ATCC 35053 with various overexpressed genes or knockouts were grown in 5 mL culture tubes containing Sistrom's media with 4 g/L glucose. After 48 hours of growth at 30° C. with 250 rpm shaking, the entire contents of the tube were used to inoculate a 300 mL baffled shake flask containing Sistrom's media with 4 g/L glucose. After incubation at 30° C. for 48 hours, 30 mL of the flask contents were added to 270 mL of Sistrom's media containing 40 g/L glucose in a 500 mL Infors AG-CH-4103 fermenter.

[0379] The fermenter was maintained at 30° C., and the cascade was set to maintain the dissolved oxygen (DO) at 40%. The air inflow was maintained at 1 vvm, and the pH was maintained at 7.3 with an automatic feed of 2N NH₄OH. Foaming was controlled by addition of Sigma Antifoam 289. Kanamycin to a concentration of 50 μg/mL was added to fermentations with strains containing the broad host range vector pBBR1MCS2 either with or without an inserted gene. At 24 to 30 hours, when the agitation increase to maintain a DO of 40% had leveled off, the agitation and DO were decoupled, and the agitation was fixed at 400 rpm. The air inflow was lowered to 0.3 vvm. Kanamycin to 50 μg/mL was again added to fermentations containing the expression vector.

[0380] The fermentation samples for coenzyme Q10 and spheroidenone analysis were removed at 69 to 75 hours into the fermentation.
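The aeration rates in Examples 11 and 12 are given in vvm (volumes of air per volume of liquid per minute), so the absolute air flow depends on the working volume. The following is a minimal sketch of that conversion and of the inoculum fraction for the 300 mL scale. The function names are hypothetical, and the 3.0 L working volume is an assumption (2.7 L of medium plus roughly 0.3 L of shake-flask culture); only the 0.3 L volume (30 mL plus 270 mL) is stated directly.

```python
def air_flow_l_per_min(vvm, working_volume_l):
    """Convert an aeration rate in vvm to an absolute air flow in L/min."""
    if vvm < 0 or working_volume_l <= 0:
        raise ValueError("vvm must be >= 0 and working volume must be > 0")
    return vvm * working_volume_l


def inoculum_fraction(inoculum_ml, medium_ml):
    """Volume fraction of inoculum in the broth right after inoculation."""
    return inoculum_ml / (inoculum_ml + medium_ml)


# Working volumes: 0.3 L is stated in Example 12; 3.0 L is an assumption
# for Example 11 (2.7 L medium + ~0.3 L shake-flask culture).
for label, volume_l in (("3 L scale", 3.0), ("300 mL scale", 0.3)):
    growth_phase = air_flow_l_per_min(1.0, volume_l)    # first 24 to 30 hours
    limited_phase = air_flow_l_per_min(0.3, volume_l)   # after decoupling
    print(f"{label}: {growth_phase:.2f} L/min at 1 vvm, "
          f"{limited_phase:.2f} L/min at 0.3 vvm")

print(f"Inoculum fraction, 300 mL scale: {inoculum_fraction(30, 270):.0%}")
```

Expressing the setpoints per unit volume is what allows the same 1 vvm and 0.3 vvm values to be reused at both scales while the fixed agitation speed (240 rpm versus 400 rpm) is adjusted separately.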

Example 13 Analysis of Spheroidenone

[0381] At various times during the fermentation, 15 mL of fermentation volume was withdrawn. The volume of sample needed to obtain 5 mg of dry cell weight (DCW) was calculated (step 1) and used for spheroidenone analysis. The sample was washed one time in water and resuspended in an equal volume of water. The volume of sample calculated in step 1 was added to a 1.8 mL microfuge tube and was centrifuged at 10,000 rpm for 3 minutes in an IEC MicroMax microfuge. The supernatant was removed, and the pellet was completely resuspended in 1.0 mL of acetone:methanol (7:2) and stored at room temperature away from light for 30 minutes. The sample was mixed once during this incubation. After incubation, the sample was centrifuged at 10,000 rpm for 3 minutes, and the extract (supernatant) was collected. Samples were stored at −20° C. for analysis at a later time. The carotenoid extract was analyzed on a spectrophotometer scanning in the range of 350 nm to 800 nm, and the OD₄₈₀ was recorded. The amount of carotenoid in mg/100 mL of culture was calculated using the following equation:

Spheroidenone (mg)/100 mL culture = ((OD₄₈₀ − (0.0816 × OD₇₇₀)) × 0.484) / Vol. of sample from step 1

[0382] From mg of Spheroidenone/100 mL of culture, the amount of Spheroidenone/mg of dry cell weight (DCW) was calculated using the DCW number as the conversion factor. Care was taken to correct for any dilution factor required while the sample was scanned on the spectrophotometer.
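The spheroidenone calculation above can be sketched in a few lines of code. This is an illustrative sketch only: the function and parameter names are hypothetical, the dilution factor is assumed to scale the corrected absorbance linearly, and the DCW figure is assumed to be expressed in mg per 100 mL of culture so that the division yields mg of spheroidenone per mg of DCW. The absorbance readings in the usage lines are made-up inputs, not measured data.

```python
def spheroidenone_mg_per_100ml(od480, od770, sample_volume_ml, dilution_factor=1.0):
    """Apply the equation from Example 13.

    od480, od770      absorbance readings of the extract
    sample_volume_ml  volume of sample from step 1 (mL)
    dilution_factor   assumed linear correction if the extract was diluted
                      before scanning (1.0 means undiluted)
    """
    if sample_volume_ml <= 0:
        raise ValueError("sample volume must be positive")
    corrected_od = (od480 - 0.0816 * od770) * dilution_factor
    return corrected_od * 0.484 / sample_volume_ml


def spheroidenone_per_mg_dcw(mg_per_100ml, dcw_mg_per_100ml):
    """Convert to mg spheroidenone per mg DCW (DCW units are an assumption)."""
    if dcw_mg_per_100ml <= 0:
        raise ValueError("DCW must be positive")
    return mg_per_100ml / dcw_mg_per_100ml


# Hypothetical readings for illustration only.
per_100ml = spheroidenone_mg_per_100ml(od480=0.85, od770=0.05, sample_volume_ml=2.0)
per_mg_dcw = spheroidenone_per_mg_dcw(per_100ml, dcw_mg_per_100ml=250.0)
print(f"{per_100ml:.4f} mg/100 mL culture, {per_mg_dcw:.6f} mg/mg DCW")
```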

Example 14 Analyzing CoQ(10) Levels Produced via Fermentation

[0383] 100 mL of fermentation broth was removed once per day and placed in a tared 250 mL centrifuge bottle. The samples were centrifuged at 15,000×g for 5 minutes, the supernatant was poured off, and the samples were resuspended in 50 mL cold water. The samples were centrifuged again at 15,000×g for 5 minutes, and the supernatant was poured off. The wet weight of the biomass was determined, and the biomass was resuspended in 1.5 times its weight in water. The samples were stored covered with foil at −80° C. before analysis.

[0384] Before analysis, the samples were warmed at 21° C. for 15 minutes. 1.0 mL was withdrawn. Sodium dodecyl sulfate was added to a final concentration of 1.67%. The samples were extracted with 14 mL of a hexane:ethanol (5:2) mixture. The samples were then evaporated to dryness and dissolved in 2 mL of a methanol:ethanol (9:2) mixture. The samples were then analyzed on a Waters Nova-Pak C18 (3.9×150 mm; 4 μm) column with a PDA detector set from 200-300 nm. Resolution was at 1.2 nm with a maximum absorbance at 275 nm. The run time was 15 minutes, and the injection volume was 20 μL.

[0385] The dry weight of the samples was determined by drying an aliquot at 105° C. in an aluminum weighing pan for at least four hours.
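The text does not spell out how the HPLC result and the dry weight are combined into a value on a dry weight basis. One plausible reading, sketched below, treats ppm as μg of CoQ(10) per g of dry biomass. All names and numbers here are hypothetical, and the arithmetic is an assumption rather than the procedure actually used.

```python
def dry_weight_fraction(wet_aliquot_g, dried_aliquot_g):
    """Fraction of the resuspended biomass that remains after oven drying."""
    if wet_aliquot_g <= 0 or not 0 < dried_aliquot_g <= wet_aliquot_g:
        raise ValueError("expected 0 < dried mass <= wet mass")
    return dried_aliquot_g / wet_aliquot_g


def coq10_ppm_dry_basis(coq10_ug, extracted_sample_g, dry_fraction):
    """CoQ(10) in ppm (assumed to mean ug per g) of dry biomass.

    coq10_ug            CoQ(10) found in the extracted sample (ug)
    extracted_sample_g  mass of resuspended biomass that was extracted (g)
    dry_fraction        result of dry_weight_fraction() for the same slurry
    """
    dry_mass_g = extracted_sample_g * dry_fraction
    if dry_mass_g <= 0:
        raise ValueError("dry mass must be positive")
    return coq10_ug / dry_mass_g


# Hypothetical inputs for illustration only.
fraction = dry_weight_fraction(wet_aliquot_g=2.0, dried_aliquot_g=0.2)
print(f"{coq10_ppm_dry_basis(300.0, 1.0, fraction):.0f} ppm, dry weight basis")
```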

Example 15 Production of CoQ(10)

[0386] The following seven experiments measured the amount of CoQ(10) produced by the indicated microorganisms in a 3 liter scale fermentation.

[0387] In experiment 1, the following data were collected after 96 hours of fermentation:

Strain              Coenzyme Q10 (ppm), dry weight basis
ATCC 35053          2950
ATCC 35053/ΔcrtE    6508

[0388] These results demonstrated that the inactivation of crtE increased the production of CoQ(10).

[0389] In experiment 2, the following data were collected after 69 to 75 hours of fermentation:

Strain                     Coenzyme Q10 (ppm), dry weight basis
ATCC 35053                 1655
ATCC 35053/ΔppsR(strep)    3812

[0390] These results demonstrated that the inactivation of ppsR increased the production of CoQ(10).

[0391] In experiment 3, the following data were collected after 69 to 75 hours of fermentation:

Strain              Coenzyme Q10 (ppm), dry weight basis    Spheroidenone (ppm), dry weight basis
ATCC 35053          2951                                    1980
ATCC 35053/ΔccoN    3527                                    2959

[0392] These results demonstrated that the inactivation of ccoN increased the production of CoQ(10) and spheroidenone.

[0393] In experiment 4, the following data were collected after 69 to 75 hours of fermentation:

Strain                                Coenzyme Q10 (ppm), dry weight basis
ATCC 35053/ΔcrtE                      3255
ATCC 35053/ΔcrtE/ΔccoN isolate 8-7    7951

[0394] These results demonstrated that the inactivation of crtE and ccoN increased the production of CoQ(10) as compared to inactivating crtE only.

[0395] In experiment 5, the following data were collected after 69 to 75 hours of fermentation:

Strain                                Coenzyme Q10 (ppm), dry weight basis
ATCC 35053/ΔcrtE                      3545
ATCC 35053/ΔcrtE/ΔccoN isolate 111    4984
ATCC 35053/ΔcrtE/ΔppsR/ΔccoN          11,676

[0396] These results demonstrated that the inactivation of crtE and ccoN increased the production of CoQ(10) as compared to inactivating crtE only. In addition, these results demonstrated that the inactivation of crtE, ccoN, and ppsR increased the production of CoQ(10) as compared to inactivating only crtE and ccoN.

[0397] In experiment 6, the following data were collected after 69 to 75 hours of fermentation:

Strain                               Coenzyme Q10 (ppm), dry weight basis
ATCC 35053/ΔcrtE                     3833
ATCC 35053/ΔcrtE/pMCS2tetP/Stdxs     4928
ATCC 35053/ΔcrtE/pMCS2glnP/Stdxs     5508
ATCC 35053/ΔcrtE/pMCS2tetP/Stdds     4652

[0398] These results demonstrated that the inactivation of crtE together with the addition of Stdxs increased the production of CoQ(10) as compared to inactivating crtE only. In addition, these results demonstrated that the use of the gln promoter with Stdxs resulted in more production of CoQ(10) when compared to the use of the tet promoter with Stdxs. Further, these results demonstrated that the inactivation of crtE together with the addition of Stdds increased the production of CoQ(10) as compared to inactivating crtE only.

[0399] In experiment 7, the following data were collected after 69 to 75 hours of fermentation:

Strain                                     CoQ(10) (ppm), dry weight basis
ATCC 35053/pMCS2tetP                       3909
ATCC 35053/pMCS2tetP/Stdxs/Rsdds           5387
ATCC 35053/pMCS2tetP/Stdxs/Rsdds/RsLytB    5962
ATCC 35053/pMCS2tetP/Stdxs/Rsdds/EcUbiC    6439

[0400] These results demonstrated that the addition of Stdxs and Rsdds increased the production of CoQ(10) as compared to adding vector only. In addition, these results demonstrated that the addition of either RsLytB or EcUbiC together with the addition of Stdxs and Rsdds increased the production of CoQ(10) as compared to adding only Stdxs and Rsdds.

[0401] The following four experiments measured the amount of CoQ(10) produced by the indicated microorganisms in a 300 mL scale fermentation.

[0402] In experiment 1, the following data were collected after 69 to 75 hours of fermentation:

Strain                                     CoQ(10) (ppm), dry weight basis
ATCC 35053/pMCS2tetP                       5250
ATCC 35053/pMCS2tetP/Stdxs                 5758
ATCC 35053/pMCS2tetP/Rsdds                 6944
ATCC 35053/pMCS2tetP/Stdxs/Rsdds           6875
ATCC 35053/pMCS2tetP/Stdxs/Rsdds/EcUbiC    7808

[0403] These results demonstrated that the addition of either Stdxs or Rsdds increased the production of CoQ(10) as compared to adding vector only. In addition, these results demonstrated that the addition of Stdxs, Rsdds, and EcUbiC increased the production of CoQ(10) as compared to adding only Stdxs and Rsdds.

[0404] In experiment 2, the following data were collected after 69 to 75 hours of fermentation:

Strain                                     CoQ(10) (ppm), dry weight basis
ATCC 35053/pMCS2tetP                       5483
ATCC 35053/pMCS2tetP/EcUbiC                6360
ATCC 35053/pMCS2tetP/RsLytB                5976
ATCC 35053/pMCS2tetP/Stdxs/Rsdds/RsLytB    6751

[0405] These results demonstrated that the addition of either EcUbiC or RsLytB increased the production of CoQ(10) as compared to adding vector only. In addition, these results demonstrated that the addition of Stdxs, Rsdds, and RsLytB increased the production of CoQ(10) as compared to adding only RsLytB.

[0406] In experiment 3, the following data were collected after 69 to 75 hours of fermentation:

Strain                                     CoQ(10) (ppm), dry weight basis
ATCC 35053/pMCS2tetP                       5072
ATCC 35053/pMCS2tetP/Stdxs/Rsdds/RsLytB    8050

[0407] These results demonstrated that the addition of Stdxs, Rsdds, and RsLytB increased the production of CoQ(10) as compared to adding vector only.

[0408] In experiment 4, the following data were collected after 69 to 75 hours of fermentation:

Strain                              Coenzyme Q10 (ppm), dry weight basis
ATCC 35053/pMCS2tetP                4503
ATCC 35053/pMCS2tetP/Stdxs/Rsdds    8833

[0409] These results demonstrated that the addition of Stdxs and Rsdds increased the production of CoQ(10) as compared to adding vector only.
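The comparisons drawn in Example 15 can be restated as fold changes relative to the first strain listed in each experiment. The sketch below does this for the seven 3 liter experiments, using the CoQ(10) values reported above. Treating the first-listed strain as the reference is an assumption made for illustration, and the names used in the code are hypothetical.

```python
# CoQ(10) titers in ppm (dry weight basis), transcribed from the 3 liter
# experiments of Example 15. The first strain in each experiment is used as
# the reference for that experiment (an assumption for illustration).
THREE_LITER_COQ10_PPM = {
    1: {"ATCC 35053": 2950, "ATCC 35053/ΔcrtE": 6508},
    2: {"ATCC 35053": 1655, "ATCC 35053/ΔppsR(strep)": 3812},
    3: {"ATCC 35053": 2951, "ATCC 35053/ΔccoN": 3527},
    4: {"ATCC 35053/ΔcrtE": 3255,
        "ATCC 35053/ΔcrtE/ΔccoN isolate 8-7": 7951},
    5: {"ATCC 35053/ΔcrtE": 3545,
        "ATCC 35053/ΔcrtE/ΔccoN isolate 111": 4984,
        "ATCC 35053/ΔcrtE/ΔppsR/ΔccoN": 11676},
    6: {"ATCC 35053/ΔcrtE": 3833,
        "ATCC 35053/ΔcrtE/pMCS2tetP/Stdxs": 4928,
        "ATCC 35053/ΔcrtE/pMCS2glnP/Stdxs": 5508,
        "ATCC 35053/ΔcrtE/pMCS2tetP/Stdds": 4652},
    7: {"ATCC 35053/pMCS2tetP": 3909,
        "ATCC 35053/pMCS2tetP/Stdxs/Rsdds": 5387,
        "ATCC 35053/pMCS2tetP/Stdxs/Rsdds/RsLytB": 5962,
        "ATCC 35053/pMCS2tetP/Stdxs/Rsdds/EcUbiC": 6439},
}


def fold_changes(titers):
    """Divide each strain's titer by the first-listed (reference) titer."""
    strains = list(titers)  # dicts preserve insertion order
    reference_ppm = titers[strains[0]]
    return {strain: titers[strain] / reference_ppm for strain in strains[1:]}


for experiment, titers in THREE_LITER_COQ10_PPM.items():
    for strain, fold in fold_changes(titers).items():
        print(f"Experiment {experiment}: {strain} = {fold:.2f}x reference")
```

Fold changes are computed only within an experiment because the reference titers differ between runs (for example, ATCC 35053 gave 2950, 1655, and 2951 ppm in experiments 1 to 3), so values from different experiments are not directly comparable.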

OTHER EMBODIMENTS

[0410] It is to be understood that while the invention has been described in conjunction with the detailed description thereof, the foregoing description is intended to illustrate and not limit the scope of the invention, which is defined by the scope of the appended claims. Other aspects, advantages, and modifications are within the scope of the following claims.

1 190 1 3626 DNA Sphingomonas trueperi misc_feature (1)...(3626) n = A,T,C or G 1 ctgcggccag accacgcata tcgacgacga ttcgatcacg aaaaacgtac ggtccgcagc 60 ccagcacgcc ggtttttcgc cggtccggcc ggtgatcgag gtgcgcggca agtgcggcaa 120 gtgtgactga cctgtccaac agaccgttcg acttgagact aacgttgcgc taacaaagcc 180 catggctgac ctacccaaga cgccgctgct cgacacggtc gacacgccgc aggacctccg 240 gaagctcgcc cccgcccagc tgcgccagct ggccgacgag cttcgtgccg aaaccatcag 300 tgcggtgggc tccaccggcg ggcatctagg ctccggcctg ggcgtcgtcg aactgacggt 360 ggcgatccac tatgtattca acacccccga cgaccggctg atctgggacg tcgggcacca 420 atgctatccg cacaagatcc tcaccggtcg gcgcgatcgg atccgcacga ttcgtcaggg 480 tggaggcctc tccggcttca ccaagcgcag cgagagcgag tatgatccgt tcggtgccgc 540 gcactcgtcg acctcgatct cggccgcact cggctttgcg atcgccaaca agctcaacga 600 ggcgccgggc aaggcgatcg cggtgatcgg cgacggcgcg atgagcgcgg gcatggccta 660 tgaggcgatg aacaacgccg aggccgccgg caaccggctg gtggtgatcc tcaacgacaa 720 cgacatgtcg atcgccccgc cggtgggcgg gctttcggcc tatcttgcgc gcctcatttc 780 ctcgtccgaa tatctcggcc tgcgcgagct cgccaagcgc ttcacccgca agctttcgcg 840 ccgcctcacc gcggcagccg gcaaggcgga ggaattcgcc cgcggcatgg cgaccggcgg 900 cacgctgttc gaggaacttg gcttctatta tgtcggcccg atcgacggcc acaatctcga 960 gcatctgatc ccggtgctgg agaatgtccg cgacagcgag cagggcccga tcctgatcca 1020 tgtcgtgacc aagaagggca agggctatgc cccggccgaa gcggcggcgg acaagtatca 1080 cggcgtccag aagttcgacg tgatcaccgg ggcacaggcc aaggcacccc cgggcccgcc 1140 cgcctatacc aaggtgttcg ccgatgcgct gctcgccgaa gcggagcgtg atgcgtcggt 1200 ctgcgcgatc accgcggcga tgccctcggg caccgggctc gacaagttcc aggcgacgtt 1260 ccccgatcgc accttcgacg tgggcattgc cgaacagcac gcggtcacct tcgcagcggg 1320 ccttgccgcg caggggatgc ggccgttctg cgcgatctac tcgaccttcc tgcagcgcgc 1380 ctacgaccag gtcgtccacg acgtcgcgat ccagaacctg ccggtccgct tcgcgatcga 1440 ccgcgcgggc ctggtcggtg ccgacggcgc gacccatgcc ggcagcttcg acgtgaccta 1500 tctcgccagc ctgcccaatt tcgtggtgat ggcggccgcg gacgaggtcg agctcgtcca 1560 catgacccac acggcggcga tgcacgacag cggcccgatc gcgctgcgct atccacgcgg 1620 caacggcgtc 
ggactggcgc tgcccaaggt tccggagcgg ctggaaatcg gcaagggtcg 1680 cgtggtccga gagggcaaga aggtagcgat cctgtcgctc ggcacgcgcc ttgcggaagc 1740 actaaaggcc gccgacacgc tcgaggccaa gggcctctcg accaccgtcg ccgacctgcg 1800 cttcgccaaa ccgctcgacg aggatctgat ccgccgcctg ctcaccaccc acgaagtggc 1860 ggtgacgatc gaggaaggcg cgatcggcgg ccccggtgcg catgtgctga cgctcgccag 1920 cgataccggc ctgatcgacg ccggcctcaa gctgcgcacc atgcgcctgc cggacatatt 1980 ccaggaccag gacaagcccg agaagcagta tgacgaagcg gggctgaacg ccgccaacat 2040 cgtcgacacg gtgctgaagg cgctccgcta caacgaggcc gagctggccg acggggtgcg 2100 ggcgtaaacg acgccagatc ctccccggaa cggggagggg aaccgccgcc gaaggcggtg 2160 gtggaggggc cgctgcggca cgcancggtt tcccaggctg agagcgatcc gcgccttgcg 2220 gcgcgccccc cccaccattc gctggcgcgg atggtccccc tccccgttcc ggggaggatc 2280 tgggtcctgc cccaccttga atctccaaca tgcacatgcc atgtacatgc acatggctac 2340 gcagcttccc cagactcgct ccagccgcgt tgtcgtgctg gtatcgcccg aggaaaaacg 2400 gcgcatttcc gccaatgcgg aagcggcgga catgacggtc agcgacttca tgcgcaccgc 2460 cgccgaacgc tataccgagc cgaccgacgc cgagatggcg ctgatgcgcg acctgctcgc 2520 ccagctcgaa caggccaatg cccgcacgga cgcggccttt gcccagctcg aagctgcgcg 2580 cgccgccgcc accgcgttcg acgaggaggc gtatcgcgcc gaggtccgcg aacagctgct 2640 gaccgatacc tcaatcgact gggatgcgct gtccactgcc ctttccggct gggcgcgcca 2700 gtgagcttct ggaccgatgc gctacgcgcg ctccagcagg tcgcgctgct ccagcacaag 2760 gtcgagcagg cgctgaccac cgccgaggaa gcccgccgcc attcaatcga gacgcgcgag 2820 cgggtgatcc ggcttgagac gctgatcgac atcgcgatga gacgccagcc cgcagcaccg 2880 cctacgccgc ctgcgcttcc cgaaagtcca caaaccggca gctagcgccc gcttccccga 2940 gcgcgtacat cgcggtacgt gctgaaaatg accatccttc ccctcaccgc ccgcccccgc 3000 gcgctcgcgc actggctgtt cgtcgtcgcc gcgatgatcg tcgcgatggt cgtggtcggg 3060 ggcattaccc ggctcaccga atcgggcctg tcgatcaccg aatggaagcc aatctccggc 3120 atcgtgcccc cgctcaacga cgcgcagtgg caggccgagt tcgaccacta caagcagatc 3180 ggccagtatg agcagctcaa ccagggcatg acgctcggcg ggttcaagag catcttcttc 3240 tgggaatata tccaccgcct gctcggccgg ctgatcggca tggtgttcgc gctgccgctg 3300 ctgtggttcg ccgtccgcaa 
gcagatcccg cagggctatg gctggcggct ggtcgcgctg 3360 ctcgcgctag gcgggctgca gggcgcgttc ggctggtgga tggtgaagtc ggggctcaac 3420 cacacccgca cctcggttag ccatttctgg ctggcgaccc acctgatgac cgcactgttc 3480 acgctgggcg gcatcgtctg gacgatgctc gacctgcgcg cgcttgccgc caaccatgcc 3540 gagcgccctg cccgactgac cgggctcggc gcgggcgtgc tggtactgct ggcggtccag 3600 ctcttctacg gggcgctggt agcagg 3626 2 1926 DNA Sphingomonas trueperi 2 atggctgacc tacccaagac gccgctgctc gacacggtcg acacgccgca ggacctccgg 60 aagctcgccc ccgcccagct gcgccagctg gccgacgagc ttcgtgccga aaccatcagt 120 gcggtgggct ccaccggcgg gcatctaggc tccggcctgg gcgtcgtcga actgacggtg 180 gcgatccact atgtattcaa cacccccgac gaccggctga tctgggacgt cgggcaccaa 240 tgctatccgc acaagatcct caccggtcgg cgcgatcgga tccgcacgat tcgtcagggt 300 ggaggcctct ccggcttcac caagcgcagc gagagcgagt atgatccgtt cggtgccgcg 360 cactcgtcga cctcgatctc ggccgcactc ggctttgcga tcgccaacaa gctcaacgag 420 gcgccgggca aggcgatcgc ggtgatcggc gacggcgcga tgagcgcggg catggcctat 480 gaggcgatga acaacgccga ggccgccggc aaccggctgg tggtgatcct caacgacaac 540 gacatgtcga tcgccccgcc ggtgggcggg ctttcggcct atcttgcgcg cctcatttcc 600 tcgtccgaat atctcggcct gcgcgagctc gccaagcgct tcacccgcaa gctttcgcgc 660 cgcctcaccg cggcagccgg caaggcggag gaattcgccc gcggcatggc gaccggcggc 720 acgctgttcg aggaacttgg cttctattat gtcggcccga tcgacggcca caatctcgag 780 catctgatcc cggtgctgga gaatgtccgc gacagcgagc agggcccgat cctgatccat 840 gtcgtgacca agaagggcaa gggctatgcc ccggccgaag cggcggcgga caagtatcac 900 ggcgtccaga agttcgacgt gatcaccggg gcacaggcca aggcaccccc gggcccgccc 960 gcctatacca aggtgttcgc cgatgcgctg ctcgccgaag cggagcgtga tgcgtcggtc 1020 tgcgcgatca ccgcggcgat gccctcgggc accgggctcg acaagttcca ggcgacgttc 1080 cccgatcgca ccttcgacgt gggcattgcc gaacagcacg cggtcacctt cgcagcgggc 1140 cttgccgcgc aggggatgcg gccgttctgc gcgatctact cgaccttcct gcagcgcgcc 1200 tacgaccagg tcgtccacga cgtcgcgatc cagaacctgc cggtccgctt cgcgatcgac 1260 cgcgcgggcc tggtcggtgc cgacggcgcg acccatgccg gcagcttcga cgtgacctat 1320 ctcgccagcc tgcccaattt cgtggtgatg gcggccgcgg 
acgaggtcga gctcgtccac 1380 atgacccaca cggcggcgat gcacgacagc ggcccgatcg cgctgcgcta tccacgcggc 1440 aacggcgtcg gactggcgct gcccaaggtt ccggagcggc tggaaatcgg caagggtcgc 1500 gtggtccgag agggcaagaa ggtagcgatc ctgtcgctcg gcacgcgcct tgcggaagca 1560 ctaaaggccg ccgacacgct cgaggccaag ggcctctcga ccaccgtcgc cgacctgcgc 1620 ttcgccaaac cgctcgacga ggatctgatc cgccgcctgc tcaccaccca cgaagtggcg 1680 gtgacgatcg aggaaggcgc gatcggcggc cccggtgcgc atgtgctgac gctcgccagc 1740 gataccggcc tgatcgacgc cggcctcaag ctgcgcacca tgcgcctgcc ggacatattc 1800 caggaccagg acaagcccga gaagcagtat gacgaagcgg ggctgaacgc cgccaacatc 1860 gtcgacacgg tgctgaaggc gctccgctac aacgaggccg agctggccga cggggtgcgg 1920 gcgtaa 1926 3 641 PRT Sphingomonas trueperi 3 Met Ala Asp Leu Pro Lys Thr Pro Leu Leu Asp Thr Val Asp Thr Pro 1 5 10 15 Gln Asp Leu Arg Lys Leu Ala Pro Ala Gln Leu Arg Gln Leu Ala Asp 20 25 30 Glu Leu Arg Ala Glu Thr Ile Ser Ala Val Gly Ser Thr Gly Gly His 35 40 45 Leu Gly Ser Gly Leu Gly Val Val Glu Leu Thr Val Ala Ile His Tyr 50 55 60 Val Phe Asn Thr Pro Asp Asp Arg Leu Ile Trp Asp Val Gly His Gln 65 70 75 80 Cys Tyr Pro His Lys Ile Leu Thr Gly Arg Arg Asp Arg Ile Arg Thr 85 90 95 Ile Arg Gln Gly Gly Gly Leu Ser Gly Phe Thr Lys Arg Ser Glu Ser 100 105 110 Glu Tyr Asp Pro Phe Gly Ala Ala His Ser Ser Thr Ser Ile Ser Ala 115 120 125 Ala Leu Gly Phe Ala Ile Ala Asn Lys Leu Asn Glu Ala Pro Gly Lys 130 135 140 Ala Ile Ala Val Ile Gly Asp Gly Ala Met Ser Ala Gly Met Ala Tyr 145 150 155 160 Glu Ala Met Asn Asn Ala Glu Ala Ala Gly Asn Arg Leu Val Val Ile 165 170 175 Leu Asn Asp Asn Asp Met Ser Ile Ala Pro Pro Val Gly Gly Leu Ser 180 185 190 Ala Tyr Leu Ala Arg Leu Ile Ser Ser Ser Glu Tyr Leu Gly Leu Arg 195 200 205 Glu Leu Ala Lys Arg Phe Thr Arg Lys Leu Ser Arg Arg Leu Thr Ala 210 215 220 Ala Ala Gly Lys Ala Glu Glu Phe Ala Arg Gly Met Ala Thr Gly Gly 225 230 235 240 Thr Leu Phe Glu Glu Leu Gly Phe Tyr Tyr Val Gly Pro Ile Asp Gly 245 250 255 His Asn Leu Glu His Leu Ile Pro Val Leu Glu Asn Val Arg Asp Ser 260 265 270 
Glu Gln Gly Pro Ile Leu Ile His Val Val Thr Lys Lys Gly Lys Gly 275 280 285 Tyr Ala Pro Ala Glu Ala Ala Ala Asp Lys Tyr His Gly Val Gln Lys 290 295 300 Phe Asp Val Ile Thr Gly Ala Gln Ala Lys Ala Pro Pro Gly Pro Pro 305 310 315 320 Ala Tyr Thr Lys Val Phe Ala Asp Ala Leu Leu Ala Glu Ala Glu Arg 325 330 335 Asp Ala Ser Val Cys Ala Ile Thr Ala Ala Met Pro Ser Gly Thr Gly 340 345 350 Leu Asp Lys Phe Gln Ala Thr Phe Pro Asp Arg Thr Phe Asp Val Gly 355 360 365 Ile Ala Glu Gln His Ala Val Thr Phe Ala Ala Gly Leu Ala Ala Gln 370 375 380 Gly Met Arg Pro Phe Cys Ala Ile Tyr Ser Thr Phe Leu Gln Arg Ala 385 390 395 400 Tyr Asp Gln Val Val His Asp Val Ala Ile Gln Asn Leu Pro Val Arg 405 410 415 Phe Ala Ile Asp Arg Ala Gly Leu Val Gly Ala Asp Gly Ala Thr His 420 425 430 Ala Gly Ser Phe Asp Val Thr Tyr Leu Ala Ser Leu Pro Asn Phe Val 435 440 445 Val Met Ala Ala Ala Asp Glu Val Glu Leu Val His Met Thr His Thr 450 455 460 Ala Ala Met His Asp Ser Gly Pro Ile Ala Leu Arg Tyr Pro Arg Gly 465 470 475 480 Asn Gly Val Gly Leu Ala Leu Pro Lys Val Pro Glu Arg Leu Glu Ile 485 490 495 Gly Lys Gly Arg Val Val Arg Glu Gly Lys Lys Val Ala Ile Leu Ser 500 505 510 Leu Gly Thr Arg Leu Ala Glu Ala Leu Lys Ala Ala Asp Thr Leu Glu 515 520 525 Ala Lys Gly Leu Ser Thr Thr Val Ala Asp Leu Arg Phe Ala Lys Pro 530 535 540 Leu Asp Glu Asp Leu Ile Arg Arg Leu Leu Thr Thr His Glu Val Ala 545 550 555 560 Val Thr Ile Glu Glu Gly Ala Ile Gly Gly Pro Gly Ala His Val Leu 565 570 575 Thr Leu Ala Ser Asp Thr Gly Leu Ile Asp Ala Gly Leu Lys Leu Arg 580 585 590 Thr Met Arg Leu Pro Asp Ile Phe Gln Asp Gln Asp Lys Pro Glu Lys 595 600 605 Gln Tyr Asp Glu Ala Gly Leu Asn Ala Ala Asn Ile Val Asp Thr Val 610 615 620 Leu Lys Ala Leu Arg Tyr Asn Glu Ala Glu Leu Ala Asp Gly Val Arg 625 630 635 640 Ala 4 2208 DNA Chlamydomonas reinhardtii 4 atgctgcgtg gtgctgtttc tcacggccct gcggtcgccg accgggctgc cgctggcccc 60 gcccgctgcg ctgctcccgt cgcccgtggt gtgcgcagcg cagcgcccac gcgtcagcgt 120 cgcgcggagg cttcggtcaa tgccccgcgg gcgggcccgg 
ccggtagcta ctcgggcgag 180 tgggataagc tttcagtgga ggagattgat gagtggcgcg atgtgggccc gaagacgccc 240 ctgctggaca ctgtcaatta cccggtgcac ctgaagaact tcaacaatga gcagctgaag 300 cagctctgca aggagctgcg cagtgacatc gtgcacaccg tctctcgcac cggtggacac 360 cttagcagca gcctgggcgt ggtggagctg acggtggcta tgcactatgt attcaacacc 420 ccggaggaca agattatttg ggacgtgggc caccaggcgt atggccacaa gatcctgact 480 ggccgtcgca agggtatggc cacgattcgc cagaccaacg gcctttcggg cttcacgaag 540 cgcgacgaga gcgagtacga ccctttcggc gctggccaca gctccacctc gatttcggcg 600 gctctgggta tggcggtggg ccgcgacgtt aagggcaaga agaacagtgt gatcgctgtc 660 atcggcgacg gcgccatcac cgggggtatg gcctatgagg ccatgaacca tgcgggcttc 720 ctggacaaga acatgattgt gattctgaac gacaaccagc aggtgtcgct gcccacgcag 780 tacaacaaca agaaccagga ccccgtgggc gccctgtcca gcgccctggc gcgcctgcag 840 gccaaccggc ccctgcgcga gctgcgcgag attgccaagg gcgtgaccaa gcagctgcct 900 gacgttgtcc agaaggcaac tgctaagatt gacgagtatg ctcgcggcat gatcagcggc 960 actggctcca cgctgtttga ggagctgggc ctgtactaca tcggccctgt ggacggccac 1020 aacctggacg acctcatcgc cgtgctcagc gaggtgcgca gcgccgagac cgtgggcccg 1080 gtgctggtgc acgtggtaac ggagaagggc cgcggctacc tgcccgccga gacggcgcag 1140 gacaagatgc acggtgtggt caagttcgac ccccgcaccg gcaagcaggt gcaggccaag 1200 acgaaggcca tgtcgtacac gaactacttc gcggacgcgc tgacggcgga ggcggagcgc 1260 gacagccgca tcgtggcggt gcacgcggcc atggcgggcg gcaccggcct gtaccggttc 1320 gagaagaagt tcccggaccg cacctttgac gtgggcattg cggagcagca cgccgtgacc 1380 tttgctgccg gcctggcgtg cgagggcctg gtgcccttct gcaccatcta cagtaccttc 1440 atgcagcgcg gttacgacca gatcgtgcac gacgtgtccc tgcagaagct gcctgtgcgc 1500 ttcgctatgg accgcgctgg cctggtgggc gctgacggct ccacgcactg cggcgccttc 1560 gacgtgacgt tcatggcgtc gctgccgcac atgatcacca tggctccctc gaacgaggcg 1620 gagctcatca acatggtggc cacctgcgcc gccatcgacg acgcgccctc gtgcttccgc 1680 ttcccccgcg gcaacggcct gggcctggac ctggccgcct acggcatcag caaggacctg 1740 aagggtgtgc ccctcgaggt gggcaagggt gttgtccgcc gccagggcaa ggacgtgtgc 1800 ctggtggcgt acggcagcag tgtgaacgag gcgctggccg cggcggacat gctggagcgc 
1860 gatggcgtgt ccaccaccgt cattgacgcg cgcttctgca agcctctgga caccaagctg 1920 atccgctcgg ctgccaagga gcaccctgtc atgatcacca tcgaggaggg ctccgtgggt 1980 ggcttcgctg cgcacgtgat gcagttcctc gcactggagg gcctgctgga cggcgggctc 2040 aagttccggc ccatgacgct gccggaccgc tacatcgacc acggcgacta ccgcgaccag 2100 ctggccatgg ccggcctcac cagccagcac atcgcctcca ccgcgctcac caccctgggg 2160 cgcgccaagg acgccgccaa gttctcactg tcagcgctgc aagcgtaa 2208 5 1849 DNA Campylobacter jejuni 5 atgagtaaaa aatttgccca tactcaagaa gagttagaaa agctaagttt aaaagaatta 60 gaaaatttag cagcatctat gcgtgaaaaa atcatacaag ttgtgagtaa aaatggtggg 120 catttaagtt caaatttggg tgctgtagaa cttagtatag caatgcattt ggtttttgat 180 gcaaaaaaag atccttttat ttttgatgtg tcgcatcagt cttatacaca caagctttta 240 agcggaaaag aagaaatatt tgatacttta agacaaatca atggtttaag tggttataca 300 aaacctagcg agggagatta ttttgtagca gggcattcta gtacctctat ttctttggca 360 gtaggtgctt gtaaggctat tgctttaaag ggtgaaaagc gtattcctgt tgctttgatt 420 ggagatggtg ctttaagtgc gggtatggcc tatgaggctt taaatgaatt gggtgattct 480 aaatttcctt gcgtaatact tttaaatgat aatgaaatga gtatttcaaa accaattgga 540 gcaatttcaa agtatctttc tcaggctatg gcaacgcagt tttatcaaag ttttaaaaag 600 cgtattgcta aaatgttgga tatattgcct gatagtgcta cttatatggc caagcgtttt 660 gaagagagtt ttaaacttat tacccctggg cttttgtttg aagaattagg gcttgaatat 720 atagggccta ttgatggaca taatttaggt gaaattattt ctgcattaaa acaagcaaaa 780 gctatgcaaa agccttgtgt gatacatgct caaaccataa agggtaaagg ctatgcttta 840 gctgaaggaa aacatgctaa atggcacggg gtgggagcct ttgatataga tagtggagag 900 agtgttaaaa aaagtgatac taaaaaatct gctactgaaa ttttttctaa gaatttgctt 960 gatttagcct caaaatatga aaatattgtt ggggttacgg cggctatgcc aagtggaaca 1020 ggtcttgata agcttataga aaaatatcca aatcgttttt gggatgtggc tattgcagaa 1080 cagcatgcag taacttctat ggccgctatg gcaaaagaag gatttaaacc ttttattgca 1140 atatatagca cctttttgca gcgtgcttat gatcaagtga tccatgattg tgcgattatg 1200 aaatttaaat gtggtttttg ctatggatag ggcagggata gtaggcgaag atggggagac 1260 gcatcaaggt gtttttgatc ttagtttttt agctcctttg ccaaatttca ctcttttagc 1320 
cccaagagat gaacaaatga tgcaaaatat aatggagtat gcttatttac atcaaggacc 1380 tattgctttg cgttatccta gagggagttt tattttggat aaagaattta atccttgtga 1440 gataaaactt ggtaaggcac aatggcttgt aaaaaataat agtgaaattg cttttttagg 1500 ttatggacaa ggtgtggcaa aagcgtggca agtcttaaga gccttgcaag aaatgaataa 1560 taatgctaat ttgattgatt taatttttgc taaaccttta gatgaagagc ttttgtgtga 1620 gcttgctaaa aaaagtaaaa tttggtttat ttttagtgaa aatgttaaaa ttggcggtat 1680 agaaagttta attaataatt ttttacaaaa atatgatttg catgtaaaag ttgttagctt 1740 tgaatatgaa gacaaattta ttgaacatgg aaaaacaagt gaggtggaaa aaaatctaga 1800 aaaagatgtc aatagtttgt tgacgaaagt tttaaaattt tatcattaa 1849 6 1884 DNA Pseudomonas aeruginosa 6 atgcccaaga cgctccatga gattccccgc gagcgccccg ccacgcccct gctcgaccgc 60 gcctcttcgc cggccgaact gcgccggctg ggcgaggcgg acctggaaac cctggccgac 120 gagctgcgcc agtacctgct gtataccgtc ggccagaccg gcggtcattt cggcgccggc 180 ctcggcgtgg tcgagctgac cattgccctg cactacgtct tcgacactcc ggacgaccgc 240 ctggtctggg acgtcggcca ccaggcctat ccgcacaaga tcctcaccga gcgccgcgag 300 ctgatgggca ccctgcgcca gaagaacggc ctggcggcct tcccgcgccg cgcagagagc 360 gagtacgaca ccttcggcgt cggccactcc agcacctcca tcagcgccgc cctgggcatg 420 gccatcgccg cccgcctgca aggcaaggag cgtaagtcgg tggccgtgat cggcgacggt 480 gcgctgaccg ccggcatggc cttcgaggca ctcaaccacg cctcggaagt cgacgccgac 540 atgctggtga tcctcaacga caacgacatg tcgatctcgc acaacgtcgg cgggctctcc 600 aactacctgg cgaagatcct ctccagccgc acctatagca gcatgcgcga gggcagcaag 660 aaggtgctct cgcgcctgcc cggggcctgg gagatcgccc ggcgcaccga ggaatacgcc 720 aagggcatgc tggtccccgg caccctgttc gaggagctcg gctggaatta catcggcccg 780 atcgacggcc acgacctgcc gaccctggtg gctaccctgc gcaacatgcg cgacatgaag 840 ggcccgcagt tcctccatgt ggtgaccaag aaaggcaagg gcttcgcccc ggccgaactg 900 gatccgatcg gctaccacgc gatcaccaag ctggaagctc ccggcagtgc gccgaagaag 960 accggcggac ccaagtattc cagcgtcttc ggccagtggc tgtgcgacat ggccgcccag 1020 gacgcgcgcc tgctcggcat caccccggcg atgaaggaag gttccgacct ggtggccttc 1080 agcgaacgtt atccggaacg ctacttcgac gtcgccatcg ccgaacagca tgccgtgacc 1140 
ctggccgccg gcatggcctg cgagggcatg aagccggtgg tagcgatcta ctcgaccttc 1200 ctccagcgcg cctacgacca gttgatccat gacgtcgccg tgcagcacct cgacgtgctg 1260 ttcgccatcg accgcgccgg cctggtcggc gaggacggcc cgacccacgc cggtagcttc 1320 gacatctcct acctgcgctg catccccggc atgctggtga tgacccccag cgacgaggac 1380 gagctgcgca agctgctcac caccggctac ctgttcgatg gcccggccgc ggtgcgctat 1440 ccgcgcggca gcggccccaa ccatccgatc gatccggacc tgcaaccggt ggagatcggc 1500 aagggcgtgg tccgtcggcg cggcggcagg gtcgcactgc tggtcttcgg cgtgcagttg 1560 gcggaggcga tgaaggtcgc cgaaagcctc gacgccacgg tcgtcgacat gcgtttcgtc 1620 aaacccctcg acgaagccct ggtacgcgaa ttggcgggca gccacgaact gctggtgacc 1680 atcgaggaaa acgccgtgat gggcggcgcc ggctcggcgg tcggcgagtt cctcgccagc 1740 gagggcctcg aagtcccgct gctgcaactg ggcctgcccg actactacgt cgaacacgcc 1800 aagcccagcg agatgctcgc cgaatgcggc ctggatgccg cgggcatcga aaaggcagta 1860 cgccagcgtc tcgaccggca gtag 1884 7 2160 DNA Lycopersicon esculentum 7 atggctttgt gtgcttatgc atttcctggg attttgaaca ggactggtgt ggtttcagat 60 tcttctaagg caaccccttt gttctctgga tggattcatg gaacagatct gcagtttttg 120 ttccaacaca agcttactca tgaggtcaag aaaaggtcac gtgtggttca ggcttcctta 180 tcagaatctg gagaatacta cacacagaga ccgccaacgc ctattttgga cactgtgaac 240 tatcccattc atatgaaaaa tctgtctctg aaggaactta aacaactagc agatgaacta 300 aggtcagata caattttcaa tgtatcaaag actgggggtc accttggctc aagtcttggt 360 gttgttgagc tgactgttgc tcttcattat gtcttcaatg caccgcaaga taggattctc 420 tgggatgttg gtcatcagtc ttatcctcac aaaatcttga ctggtagaag ggacaagatg 480 tcgacattaa ggcagacaga tggtcttgca ggatttacta agcgatcgga gagtgaatat 540 gattgctttg gcaccggcca cagttccacc accatctcag caggcctagg gatggctgtt 600 ggtagagatc taaaaggaag aaacaacaat gttattgccg taataggtga tggtgccatg 660 acagcaggtc aagcttatga agccatgaat aatgctggtt acctggactc tgacatgatt 720 gttatcttaa acgacaatag acaagtttct ttacctactg ctactctgga tgggccagtt 780 gctcctgttg gagctctaag tagtgctttg agcaggttac agtctaatag gcctctcaga 840 gaactaagag aagtcgcaaa gggagttact aagcagattg gtggtcctat gcatgagctt 900 gctgcaaaag ttgatgaata 
tgctcgtggc atgattagtg gttctggatc aacattgttt 960 gaagaacttg gactttacta tattggtcct gtggatggtc acaacattga tgatctaatt 1020 gcgattctca aagaggttag aagtactaaa acaacaggtc cagtactgat ccatgttgtc 1080 actgagaaag gcagaggtta tccatatgct gagagagctg cagataagta tcatggagtt 1140 gccaagtttg atccagcaac aggaaagcaa ttcaaagcca gtgccaagac acagtcctat 1200 acaacatatt ttgccgaggc tttaattgca gaagcagaag cagataaaga cattgttgca 1260 atccatgctg ccatgggggg tgggaccgga atgaaccttt tccatcgtcg cttcccaaca 1320 aggtgttttg atgttggaat agcagaacaa catgcagtaa cctttgctgc tggattggct 1380 tgtgaaggca ttaaaccttt ctgtgcaatc tattcgtctt tcatgcagag ggcttatgac 1440 caggtagtgc atgacgttga tttgcaaaag ctgcccgtga ggtttgcaat ggacagagca 1500 ggtcttgttg gagcagatgg tccaacacat tgtggtgcat ttgatgttac ttacatggca 1560 tgtcttccta acatggttgt aatggctcct tctgatgaag cggagctatt tcacatggta 1620 gcaactgctg ccgccattga tgacagacca agttgtttta gatacccaag aggaaatggg 1680 atcggtgtag agcttccggc tggaaacaaa ggaattcctc ttgaggttgg taaaggtagg 1740 atattgattg agggggagag agtggctcta ttgggatatg gctcagcagt gcagaactgt 1800 ttggatgctg ctattgtgct agaatcccgc ggcttacaag taacagttgc agatgcacgt 1860 ttctgcaaac cactggacca tgccctcata aggagccttg caaaatcaca tgaagtgcta 1920 atcactgtcg aagaaggatc aattggaggt tttggatctc atgttgttca gttcatggcc 1980 ttagatgggc ttcttgatgg caagttgaag tggagaccaa tagttcttcc tgatcgatac 2040 attgaccatg gatctcctgt tgatcagttg gcggaagctg gcctaacacc atctcacatt 2100 gcagcaacag tatttaacat acttggacaa accagagagg ctctagaggt catgacataa 2160 8 1916 DNA Mycobacterium tuberculosis 8 atgctgcaac agatccgcgg gcccgctgat ctgcagcacc tttcccaggc gcagcttcgg 60 gagctggccg ccgagatccg tgagttcctg atccacaagg ttgccgccac gggggggcat 120 ctggggccga acctgggagt ggtggaactc accttggcgc tgcaccgggt attcgactcg 180 ccgcacgatc cgatcatctt cgacaccggt caccaggcct acgtccacaa gatgttgacc 240 ggacgcagcc aggacttcgc aaccctgcgt aagaagggcg ggttgtcggg gtatccgtct 300 cgtgccgaga gcgagcacga ctgggtggag tcgagccacg ccagcgcggc gctgtcgtac 360 gcggacgggt tggccaaggc gttcgagttg accggacacc gcaaccggca tgtggtcgcg 420 
gtggtcggtg acggtgcgct caccggcggt atgtgctggg aggcgctgaa caatatcgcc 480 gcatcccgcc ggccggtgat tatcgtggtc aacgacaatg ggcgcagcta cgcgcccaca 540 atcgggggcg tcgccgacca tctggccacg ctgcggctgc agccggccta cgagcaggcg 600 ctggagacgg gccgcgacct ggtgcgcgcg gtgccgcttg tcggcggtct gtggtttcga 660 tcctgcacag cgtcaaggcc ggcatcaagg actcgctgtc gccgcagttg ctgttcaccg 720 acctcgggtt gaagtacgtc ggcccggtcg acggccatga cgagcgggcg gtggaggtcg 780 cgctgcgcag cgcgcggcgc ttcggtgcac cggtgatcgt gcacgtcgtc acccgcaagg 840 gcatgggcta cccgccggcc gaggccgacc aggccgagca gatgcattcc acggtcccga 900 tcgatccggc caccggacaa gccaccaagg tggccggccc aggctggacg gcgaccttct 960 ctgatgcact tatcggctac gcccagaaac gccgtgacat cgtggccatt accgcggcca 1020 tgccgggccc caccgggctg accgcgttcg ggcagcgctt cccggatcga ttgttcgacg 1080 tcgggatcgc cgagcaacac gcgatgacgt cggcggccgg gttggcgatg ggtgggctgc 1140 accccgtggt ggcgatctac tcgacgttcc tgaaccgggc gttcgaccag atcatgatgg 1200 atgtggcgct gcacaagctg ccggtcacca tggtgctgga ccgtgccggg atcaccggta 1260 gcgacggcgc cagccacaac ggaatgtggg acttgtcgat gctgggtatc gtgcccggca 1320 tccgggtggc agcgcccaga gacgccaccc ggttgcgtga agaactcggc gaggcgctcg 1380 acgtcgacga cggcccgacg gcgttacggt tccccaaagg tgatgtggga gaagatattt 1440 cggctttgga gcggcgtgga ggcgtggatg tgctggcggc gcccgccgat ggtttgaacc 1500 acgacgtcct gttggtggcc atcggcgcgt tcgcaccgat ggcgttggcg gtggccaagc 1560 ggctgcacaa ccaggggatc ggtgtgacgg tgatcgaccc gcgctgggtg ttgccggtgt 1620 ctgacggtgt gcgcgaactg gcggtgcagc acaagctgct cgtcacgcta gaggacaacg 1680 gggtcaacgg tggggcgggg tcagcggtgt cggccgcgct gcggcgcgcg gagatcgacg 1740 tgccctgccg cgatgtcggg ttgccgcagg agttctacga gcacgcgtct cgaagcgagg 1800 tgctggccga tctggggctt accgaccagg acgtggcccg gcggatcacc ggctgggtcg 1860 ccgcgctggg taccggggtg tgtgcgtccg acgcgattcc agaacatctc gactaa 1916 9 1914 DNA Rhodobacter sphaeroides 9 atgaccgaca gaccctgcac gccgacgctc gaccgggtga cgctcccggt ggacataaag 60 ggcctcacgg accgtgagtt gcgctcgctg gccgacgagc tgcgggccga aacgatctcg 120 gccgtgtcgg tgacgggcgg gcatctgggc gcaggcctcg gcgtggtgga 
gttgacggtt 180 gcgctgcatg cgatcttcga tgcgccccgc gacaagatca tctgggacgt gggccaccag 240 tgctaccccc acaagatcct gaccgggcgg cgcgaccgca tccgcaccct gcggcagggc 300 gggggtctct cgggcttcac caagcgctcc gagagcccct atgactgttt cggcgcgggc 360 cattcctcga cctcgatctc ggccgcggtg ggctttgccg cggcacgcga gatgggcggc 420 gacacgggcg acgcggtggc ggtgatcggc gacggctcga tgtcggccgg catggccttc 480 gaggcgctga accacggcgg gcacctgaag aaccgggtga tcgtgatcct gaacgacaac 540 gagatgagca tcgcgccgcc ggtgggggcg ctgtcgtcct atctctcgcg gctctatgcg 600 ggcgcgccgt tccaggactt caaggcggcc gccaagggag cgctcgggct tctgcccgaa 660 ccgttccagg agggcgcgcg ccgcgccaag gagatgctga agagcgtcac cgtcggcggc 720 acgctcttcg aggagctggg tttctcctat gtcggcccga tcgacgggca cgatctcgac 780 cagcttctgc cggtgctgcg gaccgtcaag cagcgggcgc atgcgccggt gctgatccat 840 gtcatcacca agaagggcag gggctatgct ccggccgagg ccgcgcgcga ccgtggccat 900 gccacgaaca agttcaacgt cctgaccggc gcgcaggtga agccggtctc gaacgccccc 960 tcctatacca aggtcttcgc ccagagcctc atcaaggagg ccgaggtcga cgagcggatc 1020 tgcgcggtga cggccgccat gccggacggg acggggctca acctcttcgg cgagcggttt 1080 ccgaagcgca ccttcgatgt gggcatcgcg gaacagcatg cggtgacctt ctcggcggcg 1140 cttgcggcag gcggcatgcg gcccttctgc gccatctatt ccaccttcct ccagcgcggc 1200 tacgaccaga tcgtgcatga cgtggcaatc cagcgcctgc cggtgcgctt tgccatcgac 1260 cgcgccggcc tcgtgggggc ggacggcgcc acccatgcgg gctcgttcga tgtggccttc 1320 ctgtcgaacc tgcccggcat cgtggtgatg gccgccgccg acgaggccga gctcgtccat 1380 atggtagcca ccgccgccgc ccatgacgaa gggcccatcg ccttccgcta tccgcgcggc 1440 gacggcgtgg gggtcgaggt gccggtgaag ggcgtgccgc tccagatcgg ccgtggccgg 1500 gtggtgagcg agggcacgcg aatcgcgctc ctgtccttcg gcacccgtct ggccgaggtg 1560 caggtggccg ccgaggcgct ggctgcgcgc gggatctctc ccacggttgc ggatgcgcgc 1620 tttgcaaagc cgctcgaccg ggatctgatc ctgcagctcg cggcccatca cgaggcgctc 1680 attaccatcg aggagggcgc catcggcggc ttcggcagcc atgtggcgca gcttctggcc 1740 gaggccgggg tcttcgaccg cggcttccgg tatcgctcga tggtgctgcc cgacacgttc 1800 atcgaccaca acagcgccga agtgatgtat gccaccgccg ggctgaatgc ggccgacata 1860 
gagcggaagg cgctggagac gctgggggtg gaggtcctcg cccgccgcgc ctga 1914 10 1947 DNA Rhodobacter sphaeroides 10 atgaccaatc ccaccccgcg acccgaaacc ccgcttttgg atcgcgtctg ctgcccggcc 60 gacatgaagg cgctgagtga cgccgaactg gagcggctgg ccgacgaagt gcgttccgag 120 gtgatttcgg tcgttgccga gacgggagga catctggggt cctcgctggg ggtggttgag 180 ctgactgtcg cgctgcatgc ggtcttcaac acgcccaccg acaagctcgt ctgggacgtg 240 ggccaccagt gctaccccca caagatcctc accggccggc gcgagcagat gcgcaccctg 300 cgccagaagg gcggcctctc gggcttcacc aagcgctcgg aatccgccta cgacccgttc 360 ggcgcggctc attcctcgac ctcgatctcg gccgcgctcg gctttgccat gggtcgcgag 420 ctgggccagc ccgtgggcga cacgatcgcc gtgatcggcg acggctccat caccgcgggc 480 atggcctacg aggcactgaa ccacgcgggc catctgaaca agcgcctgtt cgtgatcctg 540 aacgacaatg acatgagcat cgcgccgccc gtgggggcgc ttgcgcgcta tctcgtgaat 600 ctctcctcga aggcgccctt cgccacgctg cgcgcggccg ccgacgggct cgaggcctcg 660 ctgccggggc cgctccgcga cggggcgcgc cgggcgcgcc agctcgtgac cgggatgccg 720 ggcgggggca cgctcttcga ggagctgggc ttcacctatg tcggccccat cgacggccac 780 gacatggagg cgctcctcca gacgctgcgc gcggcgcggg cccggaccac ggggccggtg 840 ctcatccatg tggtcacgaa gaagggcaag ggttacgccc ccgccgagaa tgcccccgac 900 aagtatcacg gggtgaacaa gttcgacccc gtcacgggcg agcagaagaa gtcggtggcc 960 aacgcgccga actacaccaa ggtcttcggc tccaccctga ccgaggaggc cgcgcgcgat 1020 ccgcgcatcg tggcgatcac cgccgctatg ccctcgggca ccggcgtcga catcatgcag 1080 aagcgtttcc cgaaccgcgt cttcgacgtg ggcatcgccg agcagcatgc cgtgaccttc 1140 gcggccggcc tcgccggggc cgggatgaag cccttctgcg cgatctattc ctcgttcctg 1200 caacggggtt acgaccagat cgcccatgac gtggcgctgc agaaccttcc cgtccgcttc 1260 gtgatcgacc gggcggggct cgtgggggcc gatggcgcga cccatgcggg ggccttcgac 1320 gttggcttca tcacttcgct gcccaacatg accgtgatgg ccgcggccga cgaggccgag 1380 ctcatccaca tgatcgccac cgccgtggcc ttcggcgagg gccccatcgc cttccgcttc 1440 ccgcggggcg agggggtggg cgtcgagatg cccgagcgcg ggacggtgct ggagcccggc 1500 cggggccgcg tggtgcgcga agggacggat gtcgcgatcc tctccttcgg cgcgcatctg 1560 cacgaggcct tgcaggcggc gaaacttctc gaggccgagg gggtgagcgt gaccgtggcc 
1620 gacgcccgct tctcgcgccc gctcgacacg gggcacatcg accagctcgt gcgccatcac 1680 gcggcgctgg taacggtgga gcagggggcc atgggcggct tcggcgccta tgtcatgcac 1740 tgtctcgcca attccggcgg cttcgacggg ggcctcgcgc tccgggtcat gacgctgccc 1800 gaccgcttca tcgagcaggc gagccccgag gacatgtatg ccgatgcggg gctgcgggcc 1860 gaggatatcg cggccaccgc gcggggcgcg ctcgcccggg ggcgcgtgat gccgctccgg 1920 cagacggcaa agccgcgggc ggtctga 1947 11 1911 DNA Synechococcus sp. PC 6301 11 atgcatctca gcgaaattac ccatcccaac cagctccacg ggttgtcggt tgctcagctt 60 gagcaaattg gccaccagat tcgtgagaag cacctgcaga cggttgcagc gaccggtggg 120 cacctcgggc cgggcttggg cgtggtggaa ttgaccctag cgctttacca aacgctcgat 180 ctcgatcgcg acaaagtggt ttgggacgtt ggccaccaag cctatcccca caagctgctg 240 acagggcgct atcacaactt ccataccttg cggcaaaagg atggcattgc gggctacccg 300 aagcgcacgg aaaaccgctt cgatcatttc ggtgccggtc acgcttccac cagtatttct 360 gctggcctcg gtatggctct agcacgggat gcccagggcg aagactaccg atgtgtcgct 420 gtgattggtg atggatcgct caccggtggc atggccttgg aagccatcaa ccacgctggt 480 cacttgccca aaacacggct gttggtcgtg ctcaacgaca atgacatgtc gatctcgccc 540 aacgtgggtg cgctctctcg ctatctgaat aagattcggg ttagtgagcc gatgcagttg 600 ctcaccgatg gtttgaccca ggggatgcaa caaattccct tcgtcggcgg cgccattacc 660 caaggctttg agccggttaa ggaaggcatg aagcgcctct cctacagcaa gattggggcg 720 gtctttgaag agctgggctt cacctacatg gggccagtgg atggtcacaa ccttgaagaa 780 ctgatcgcca ccttccgcga agcgcacaaa cacaccggac cagtactcgt ccacgttgcc 840 acaaccaagg gtaagggcta tccctacgct gaagaagatc aggttggcta tcatgcccaa 900 aatccctttg atctggcgac agggaaggct aaaccagctt caaaaccgaa gccgcctagc 960 tattccaaag tgtttggcca aaccctgacg accttggcca agagcgatcg ccgcattgtc 1020 gggattacgg ctgcgatggc gacaggcacc ggcttggaca ttctccagaa ggcgctgccg 1080 aagcaataca tcgatgttgg cattgccgaa cagcacgccg tggtgctagc tgccggtatg 1140 gcctgcgatg gcatgcgtcc ggtggtggca atctattcca ccttcctgca gcgggccttt 1200 gatcaagtca tccacgacgt ttgtatccaa aagctgcccg tcttcttctg cctcgatcgc 1260 gcggggatag ttggcgcgga tggcccgact caccaaggca tgtacgacat tgcttacctg 1320 cggctgattc 
ccaacatggt gctgatggca ccgaaagatg aggccgaact gcagcggatg 1380 ctagtgacgg gtattgaata cgacggcccg atcgccatgc gtttcccgcg cgggaatggt 1440 attggcgtac ccctgccgga agaaggctgg gagtcgctcc cgattgggaa agcagagcaa 1500 ctgcgccaag gcgatgattt gctgatgttg gcttacggct cgatggtcta tccggccctg 1560 cagacggcag aactgctgaa tgagcacggc atctcagcta ctgtgatcaa tgcccgcttc 1620 gccaagccct tagatgagga actgattgtg ccgctggcgc gccagatcgg caaagtcgtc 1680 acctttgagg aaggctgcct acccggcggc tttggctccg cgattatgga gtccttgcag 1740 gcccatgatc tgcaggttcc ggtgttgccg atcggtgttc ccgatctctt ggtggaacat 1800 gccagccctg atgaatctaa acaggagttg ggcctgacgc cgcgtcagat ggccgatcgc 1860 atcctcgaaa agtttggaag ccgtcaacgg attggtgctg cttcggcttg a 1911 12 1863 DNA Escherichia coli 12 atgagttttg atattgccaa atacccgacc ctggcactgg tcgactccac ccaggagtta 60 cgactgttgc cgaaagagag tttaccgaaa ctctgcgacg aactgcgccg ctatttactc 120 gacagcgtga gccgttccag cgggcacttc gcctccgggc tgggcacggt cgaactgacc 180 gtggcgctgc actatgtcta caacaccccg tttgaccaat tgatttggga tgtggggcat 240 caggcttatc cgcataaaat tttgaccgga cgccgcgaca aaatcggcac catccgtcag 300 aaaggcggtc tgcacccgtt cccgtggcgc ggcgaaagcg aatatgacgt attaagcgtc 360 gggcattcat caacctccat cagtgccgga attggtattg cggttgctgc cgaaaaagaa 420 ggcaaaaatc gccgcaccgt ctgtgtcatt ggcgatggcg cgattaccgc aggcatggcg 480 tttgaagcga tgaatcacgc gggcgatatc cgtcctgata tgctggtgat tctcaacgac 540 aatgaaatgt cgatttccga aaatgtcggc gcgctcaaca accatctggc acagctgctt 600 tccggtaagc tttactcttc actgcgcgaa ggcgggaaaa aagttttctc tggcgtgccg 660 ccaattaaag agctgctcaa acgcaccgaa gaacatatta aaggcatggt agtgcctggc 720 acgttgtttg aagagctggg ctttaactac atcggcccgg tggacggtca cgatgtgctg 780 gggcttatca ccacgctaaa gaacatgcgc gacctgaaag gcccgcagtt cctgcatatc 840 atgaccaaaa aaggtcgtgg ttatgaaccg gcagaaaaag acccgatcac tttccacgcc 900 gtgcctaaat ttgatccctc cagcggttgt ttgccgaaaa gtagcggcgg tttgccgagc 960 tattcaaaaa tctttggcga ctggttgtgc gaaacggcag cgaaagacaa caagctgatg 1020 gcgattactc cggcgatgcg tgaaggttcc ggcatggtcg agttttcacg taaattcccg 1080 gatcgctact 
tcgacgtggc aattgccgag caacacgcgg tgacctttgc tgcgggtctg 1140 gcgattggtg ggtacaaacc cattgtcgcg atttactcca ctttcctgca acgcgcctat 1200 gatcaggtgc tgcatgacgt ggcgattcaa aagcttccgg tcctgttcgc catcgaccgc 1260 gcgggcattg ttggtgctga cggtcaaacc catcagggtg cttttgatct ctcttacctg 1320 cgctgcatac cggaaatggt cattatgacc ccgagcgatg aaaacgaatg tcgccagatg 1380 ctctataccg gctatcacta taacgatggc ccgtcagcgg tgcgctaccc gcgtggcaac 1440 gcggtcggcg tggaactgac gccgctggaa aaactaccaa ttggcaaagg cattgtgaag 1500 cgtcgtggcg agaaactggc gatccttaac tttggtacgc tgatgccaga agcggcgaaa 1560 gtcgccgaat cgctgaacgc cacgctggtc gatatgcgtt ttgtgaaacc gcttgatgaa 1620 gcgttaattc tggaaatggc cgccagccat gaagcgctgg tcaccgtaga agaaaacgcc 1680 attatgggcg gcgcaggcag cggcgtgaac gaagtgctga tggcccatcg taaaccagta 1740 cccgtgctga acattggcct gccggacttc tttattccgc aaggaactca ggaagaaatg 1800 cgcgccgaac tcggcctcga tgccgctggt atggaagcca aaatcaaggc ctggctggca 1860 taa 1863 13 1914 DNA Neisseria meningitidis 13 atgaacccaa gccccctact cgacctgatt gacagcccgc aagatttgcg ccgtctggac 60 aaaaaacagc tgccgcgcct tgccggcgag ttgcgcacct ttctgctgga atctgtcggg 120 cagaccggcg ggcatttcgc cagcaatttg ggcgcggtcg agctgacggt tgcgctgcac 180 tacgtttaca acacgcccga agacaagctg gtgtgggatg tcggacacca aagctatccg 240 cacaaaattc ttaccggacg taaaaaccag atgcacacca tgcgccaata tggcggtttg 300 gcgggttttc cgaaacgttg cgagtccgag tacgacgcgt tcggcgtggg gcattcctcc 360 acctccatcg gcgcggcgtt gggcatggcg gcggcggaca aacagttggg cagcgaccgc 420 cgcagcgtcg ccatcatcgg cgacggcgcg atgacggcgg gtcaggcgtt tgaagccttg 480 aactgcgcgg gcgatatgga tgtggatttg ctggtcgtcc tcaacgacaa cgaaatgtcg 540 atttccccca acgtcggtgc gttgcccaaa taccttgcca gcaacgtcgt gcgcgatatg 600 cacggactgt tgagtaccgt caaagcgcaa acgggcaagg tattagacaa aatacccggc 660 gcgatggagt ttgcccaaaa agtcgaacat aaaatcaaaa cccttgccga agaagccgaa 720 cacgccaaac agtcactgtc tttgtttgaa aacttcggct tccgctatac cggccccgtg 780 gacggacaca acgtcgaaaa tctggtcgat gtattggaag acctgcgcgg acgcaaaggc 840 ccgcagcttc tgcacgtcat caccaaaaag ggcaacggct acaaactcgc 
cgaaaacgat 900 cccgtcaaat accacgccgt cgccaacctg cctaaagaaa gcgcggcgca aatgccgtct 960 gaaaaagaac ccaagcccgc cgccaaaccg acctataccc aagtgttcgg caaatggctg 1020 tgcgaccggg cggcggcaga ttcccgactg gttgcgatta cccccgccat gcgcgagggc 1080 agcggcttgg ttgagtttga acaacgattc cccgaccgct atttcgatgt cggcatcgcc 1140 gagcagcacg ccgttacctt tgccggcggt ttggcttgcg aagggatgaa gcccgtcgtg 1200 gcgatttatt ccaccttttt acaacgcgcc tacgaccaac tggtgcacga catcgccctg 1260 caaaacctgc ccgttttgtt tgccgtcgac cgcgcgggca tcgtcggcgc ggacggcccg 1320 acccatgccg gtttgtacga tttaagcttt ttgcgctgca ttccgaatat gattgtcgcc 1380 gcgccgagcg atgaaaatga atgccgcctg ctgctttcga cctgctatca ggcagacgcg 1440 cccgccgccg tccgctatcc gcgcggcacg ggtacgggcg tgccggtttc agacggcatg 1500 gaaaccgtgg aaatcggcaa gggcattatc cgccgcgaag gtgagaaaac cgcattcatt 1560 gccttcggca gtatggtcgc ccctgcattg gcggtcgccg gaaaactgaa cgccaccgtc 1620 gccgatatgc gcttcgtcaa accgatagac gaagagttga ttgtccgcct tgcccgaagc 1680 cacgaccgca tcgttaccct tgaagaaaac gccgaacagg gcggcgcagg cagcgcggtg 1740 ctggaagtgt tggcgaaaca cggcatctgc aaacccgtct tgcttttggg cgttgccgat 1800 accgtaaccg gacacggcga tccgaaaaaa cttttagacg atttgggctt gagtgccgaa 1860 gcggtggaac ggcgtgtgcg cgcgtggctg tcggatcggg atgcggcaaa ttaa 1914 14 1878 DNA Haemophilus influenzae 14 atgactaaca atatgaacaa ttatcctctt ttatctttaa ttaattctcc agaagatttg 60 cgtcttttaa ataaagatca gctaccacaa ctctgtcaag aattacgtgc ttatctttta 120 gaatctgtta gtcaaactag cggacattta gcgtcaggtt taggcactgt agagctaacc 180 gttgcgctgc attatgtata taagacgcca tttgatcagt taatttggga tgtgggacat 240 caagcttatc cacataaaat cctaacgggt cgccgagagc aaatgtccac aattcgccaa 300 aaagacggta ttcatccttt tccttggcgt gaagaaagtg aatttgatgt attaagtgtt 360 ggtcactcct ctacgtctat tagtgcggga ttaggcattg ccgttgccgc agaacgagaa 420 aatgcaggta gaaaaacagt atgcgtaatc ggtgatggcg caattactgc gggaatggca 480 tttgaggcat taaatcacgc gggggcattg catacagata tgttagttat tttaaatgat 540 aacgaaatgt ctatttcaga aaacgttggt gcattaaata atcatcttgc gcgtattttc 600 tctggctctc tttactctac gcttcgtgat ggcagtaaaa 
aaatccttga taaagttcct 660 ccaatcaaaa attttatgaa aaaaaccgaa gaacatatga aaggtgtaat gttttcgcca 720 gaaagtacat tatttgaaga actcggtttt aactatattg gcccagtgga tgggcataac 780 attgatgaat tagtggctac gcttacgaat atgcgtaatc tgaaaggccc acaatttttg 840 catataaaaa cgaaaaaagg taaaggatac gcacccgcag aaaaagatcc gattggtttc 900 cacggtgtac ctaaatttga tccaatcagt ggcgaattgc ccaaaaacaa tagtaaacca 960 acttattcga aaatttttgg cgattggcta tgtgaaatgg cagaaaaaga tgccaaaatt 1020 ataggtatca cacctgcaat gcgtgagggt tcaggtatgg tagaattttc ccaacgcttc 1080 ccaaaacaat attttgacgt agcgattgca gaacagcacg ctgtcacgtt tgccacagga 1140 cttgcaattg gcggatataa acctgtcgtc gcaatttact cgacattttt acaacgtgct 1200 tacgatcaat taattcacga tgttgccatt caaaatctcc ctgtgctatt tgcaattgat 1260 cgagcaggga tagttggtgc agatggggct acacatcaag gtgcattcga tattagcttt 1320 atgcgttgca ttccaaatat gatcattatg acgccgagtg atgaaaatga atgccgtcaa 1380 atgctctata caggttatca atgtggaaaa cctgcggcag tgcgctaccc tcgcggaaat 1440 gccgttggtg taaaacttac tcctttagaa atgcttccta ttggtaaatc acgtttaatt 1500 cgaaaaggtc aaaaaattgc gattttaaat tttggtactc tattaccatc cgctttagag 1560 ttatcagaaa aactcaatgc aacggttgtc gatatgcgtt ttgtgaaacc gattgatatt 1620 gaaatgatta atgtgcttgc acaaactcac gattatttgg tcacattgga agaaaatgca 1680 attcaaggtg gagcgggatc tgctgttgcg gaagtactaa attcatcagg aaaatcaacc 1740 gcacttttac aacttggctt gccagattat tttattccac aagcgacaca gcaagaagca 1800 ttggcagatt taggattgga tacaaaaggc attgaagaaa aaattctcaa ctttattgca 1860 aaacaaggta atttataa 1878 15 627 PRT Pseudomonas aeruginosa 15 Met Pro Lys Thr Leu His Glu Ile Pro Arg Glu Arg Pro Ala Thr Pro 1 5 10 15 Leu Leu Asp Arg Ala Ser Ser Pro Ala Glu Leu Arg Arg Leu Gly Glu 20 25 30 Ala Asp Leu Glu Thr Leu Ala Asp Glu Leu Arg Gln Tyr Leu Leu Tyr 35 40 45 Thr Val Gly Gln Thr Gly Gly His Phe Gly Ala Gly Leu Gly Val Val 50 55 60 Glu Leu Thr Ile Ala Leu His Tyr Val Phe Asp Thr Pro Asp Asp Arg 65 70 75 80 Leu Val Trp Asp Val Gly His Gln Ala Tyr Pro His Lys Ile Leu Thr 85 90 95 Glu Arg Arg Glu Leu Met Gly Thr Leu Arg Gln Lys Asn 
Gly Leu Ala 100 105 110 Ala Phe Pro Arg Arg Ala Glu Ser Glu Tyr Asp Thr Phe Gly Val Gly 115 120 125 His Ser Ser Thr Ser Ile Ser Ala Ala Leu Gly Met Ala Ile Ala Ala 130 135 140 Arg Leu Gln Gly Lys Glu Arg Lys Ser Val Ala Val Ile Gly Asp Gly 145 150 155 160 Ala Leu Thr Ala Gly Met Ala Phe Glu Ala Leu Asn His Ala Ser Glu 165 170 175 Val Asp Ala Asp Met Leu Val Ile Leu Asn Asp Asn Asp Met Ser Ile 180 185 190 Ser His Asn Val Gly Gly Leu Ser Asn Tyr Leu Ala Lys Ile Leu Ser 195 200 205 Ser Arg Thr Tyr Ser Ser Met Arg Glu Gly Ser Lys Lys Val Leu Ser 210 215 220 Arg Leu Pro Gly Ala Trp Glu Ile Ala Arg Arg Thr Glu Glu Tyr Ala 225 230 235 240 Lys Gly Met Leu Val Pro Gly Thr Leu Phe Glu Glu Leu Gly Trp Asn 245 250 255 Tyr Ile Gly Pro Ile Asp Gly His Asp Leu Pro Thr Leu Val Ala Thr 260 265 270 Leu Arg Asn Met Arg Asp Met Lys Gly Pro Gln Phe Leu His Val Val 275 280 285 Thr Lys Lys Gly Lys Gly Phe Ala Pro Ala Glu Leu Asp Pro Ile Gly 290 295 300 Tyr His Ala Ile Thr Lys Leu Glu Ala Pro Gly Ser Ala Pro Lys Lys 305 310 315 320 Thr Gly Gly Pro Lys Tyr Ser Ser Val Phe Gly Gln Trp Leu Cys Asp 325 330 335 Met Ala Ala Gln Asp Ala Arg Leu Leu Gly Ile Thr Pro Ala Met Lys 340 345 350 Glu Gly Ser Asp Leu Val Ala Phe Ser Glu Arg Tyr Pro Glu Arg Tyr 355 360 365 Phe Asp Val Ala Ile Ala Glu Gln His Ala Val Thr Leu Ala Ala Gly 370 375 380 Met Ala Cys Glu Gly Met Lys Pro Val Val Ala Ile Tyr Ser Thr Phe 385 390 395 400 Leu Gln Arg Ala Tyr Asp Gln Leu Ile His Asp Val Ala Val Gln His 405 410 415 Leu Asp Val Leu Phe Ala Ile Asp Arg Ala Gly Leu Val Gly Glu Asp 420 425 430 Gly Pro Thr His Ala Gly Ser Phe Asp Ile Ser Tyr Leu Arg Cys Ile 435 440 445 Pro Gly Met Leu Val Met Thr Pro Ser Asp Glu Asp Glu Leu Arg Lys 450 455 460 Leu Leu Thr Thr Gly Tyr Leu Phe Asp Gly Pro Ala Ala Val Arg Tyr 465 470 475 480 Pro Arg Gly Ser Gly Pro Asn His Pro Ile Asp Pro Asp Leu Gln Pro 485 490 495 Val Glu Ile Gly Lys Gly Val Val Arg Arg Arg Gly Gly Arg Val Ala 500 505 510 Leu Leu Val Phe Gly Val Gln Leu Ala Glu Ala Met Lys Val 
Ala Glu 515 520 525 Ser Leu Asp Ala Thr Val Val Asp Met Arg Phe Val Lys Pro Leu Asp 530 535 540 Glu Ala Leu Val Arg Glu Leu Ala Gly Ser His Glu Leu Leu Val Thr 545 550 555 560 Ile Glu Glu Asn Ala Val Met Gly Gly Ala Gly Ser Ala Val Gly Glu 565 570 575 Phe Leu Ala Ser Glu Gly Leu Glu Val Pro Leu Leu Gln Leu Gly Leu 580 585 590 Pro Asp Tyr Tyr Val Glu His Ala Lys Pro Ser Glu Met Leu Ala Glu 595 600 605 Cys Gly Leu Asp Ala Ala Gly Ile Glu Lys Ala Val Arg Gln Arg Leu 610 615 620 Asp Arg Gln 625 16 1896 DNA Streptomyces sp. CL190 16 gtgacgattc tggagaacat ccggcaacca cgcgacctga aggcgctgcc cgaggagcag 60 ctgcacgaac tgtccgagga gatcaggcag ttcctggtgc acgcggtcac cagaaccggc 120 ggtcatctgg gacccaacct gggggtggtg gagctgacca tcgccctgca ccgggtcttc 180 gagtcgcccg tcgaccgcat cctgtgggac accggccacc agagctacgt acacaagctg 240 ctgacgggac gtcaggactt ctccaagctg cgcggcaagg gcggcctgtc cggctacccc 300 tcgcgcgagg agtccgagca cgacgtcatc gagaacagcc acgcctccac cgccctcggc 360 tgggccgacg gactcgccaa ggcccgccgg gtgcaggggg agaagggcca tgtcgtcgcc 420 gtcatcggcg gacgggcgct gaccggcggc atggcctggg aggccctgaa caacatcgcg 480 gccgccaagg accagccgct gatcatcgtc gtcaacgaca acgagcgctc ctacgcgccc 540 accatcggcg gcctcgccaa ccacctggcc accctgcgca ccaccgacgg ctacgagaag 600 gtcctcgcct ggggcaagga cgtcctgctg cgtaccccca tcgtcggcca ccccctctac 660 gaggccctgc acggcgccaa gaagggcttc aaggacgcct tcgccccgca gggcatgttc 720 gaggacctgg gcctgaagta cgtcggcccc atcgacgggc acgacatcgg cgcggtcgag 780 tccgcgctgc gccgcgccaa gcgcttccac gggccggtgc tggtgcactg cctcaccgtc 840 aagggccgcg gctacgaacc cgccctcgcc cacgaggagg accacttcca caccgtcggc 900 gtgatggacc cgctcacctg tgagcccctc tcgcccaccg acggcccgtc ctggacctcg 960 gtgttcggcg acgagatcgt acggatcggc gcggagcgcg aggacatcgt cgcgatcacc 1020 gccgcgatgc tccacccggt ggggctcgcc aggttcgccg accgcttccc ggaccgggtc 1080 tgggacgtcg gcatcgccga gcagcacgcg gccgtgtccg cggccgggct cgccaccggc 1140 ggactgcacc cggtcgtcgc cgtctacgcc accttcctca accgcgcctt cgaccagctc 1200 ctgatggacg tcgccctgca ccgctgcggt gtgaccttcg tcctggaccg 
ggccggcgtc 1260 acgggcgtcg acggcgcctc gcacaacggc atgtgggaca tgtccgtcct ccaggtcgtg 1320 cccggcctca ggatcgccgc cccgcgcgac gccgaccacg tgcgcgccca gctgcgggag 1380 gcggtcgccg tggacgacgc gccgacgctg atccgcttcc cgaaggagtc cgtcggcccg 1440 cggatcccgg ccctcgaccg ggtcggcggc ctcgatgtgc tgcaccgcga cgagcggccc 1500 gaggtgctgc tggtcgccgt gggcgtcatg gcacaggtct gcctccagac cgccgagctg 1560 ctccgggccc gcggcatcgg atgcacggtc gtcgacccgc gctgggtcaa gcccgtcgac 1620 cccgtgctgc ccccactcgc cgccgagcac cggctcgtcg ccgtcgtgga ggacaacagc 1680 cgggccgccg gggtcggttc ggcggtcgcc ctggcgctcg gggacgccga tgtcgacgta 1740 ccggtgcgcc gcttcggcat ccccgagcag ttcctcgcgc acgccaggcg cggtgaggtg 1800 ctcgccgaca tcgggctgac cccggtggag atcgccgggc ggatcggcgc gagcctgccc 1860 gtgcgggagg aaccggccga ggagcagccc gcatga 1896 17 1857 DNA Helicobacter pylori 17 gtgattttgc aaaataaaac ttttgattta aaccctaacg atattgcagg cttggagttg 60 gtgtgtcaaa cgctacggaa tcgtatttta gaagtggtga gcgctaatgg ggggcattta 120 agctcttctt taggggctgt ggagctgatt gtgggcatgc atgccttatt tgattgccaa 180 aaaaaccctt tcatttttga cacttcgcac caagcttacg cccacaagct tttaaccggg 240 cgctttgaaa gctttagcac tttaaggcaa ttcaagggtt tgagcggctt tactaaaccc 300 agcgagagcg catacgatta tttcatcgcc gggcatagtt ccacttcggt gtctataggc 360 gttggggtgg ctaaagcttt ttgtttgaaa caagcgctag gcatgcccat agctttatta 420 ggcgatggga gcattagtgc agggattttt tatgaagcct taaacgaact gggcgatagg 480 aaatacccca tgatcatgat tttaaacgat aatgaaatga gtatcagcac gcctattgga 540 gccttatcca aagcccttag ccagctgatg aaaggcccgt tttaccagtc tttccgctct 600 aaagttaaaa aaatcttaag caccttacct gaaagcgtga attacttagc gagtcgtttt 660 gaagaatctt tcaagctcat caccccgggc gtgttttttg aagaattagg cattaactat 720 atagggccta ttaatgggca tgatttgagc gcgattattg aaaccttaaa attagccaaa 780 gagcttaaag agccggtgct aatccatgcg caaaccttaa agggcaaagg ctataagatc 840 gctgaagggc gctatgaaaa atggcatggg gtggggcctt ttgatttgga taccggcttg 900 tctaaaaaat ccaaaagcgc aatcttatcg cccactgaag cgtattctaa caccctttta 960 gaattagcta aaaaagatga aaaaatcgta ggcgtaaccg cggcgatgcc tagcggcaca 1020 
ggattagaca aactcattga cgcttaccct ttgcgctttt ttgatgtcgc tatcgctgag 1080 caacacgctt taacttctag cagcgctatg gctaaagagg ggtttaaacc ttttgtgagc 1140 atctattcta cttttttgca gagggcttat gattctattg tgcatgacgc ttgtatttct 1200 agcttgccga ttaaattagc cattgacagg gctgggattg tgggcgaaga tggcgagacg 1260 caccaagggc ttttagacgt gtcgtatttg cgctctatcc ctaacatggt catttttgcc 1320 ccacgagaca atgagacttt aaaaaacgcc gtgcgttttg ccaatgaaca cgattcaagc 1380 ccttgcgcgt tccgataccc tagggggtcg tttgcgttaa aagagggggt ttttgagcct 1440 agcggttttg ttttaggcca aagcgaattg ttgaaaaaag agggcgaaat tttactcata 1500 ggctatggta atggcgtggg gcgggcgcat ttagtccaac tggctttaaa agaaaaaaac 1560 atagaatgcg ctctcttgga tctcaggttt ttaaagcctt tagatccaaa tttaagcgcg 1620 atcgttgccc cttatcaaaa gctctatgtt tttagcgata attacaagct tggaggggtg 1680 gctagcgcga ttttagagtt tttgagcgaa caaaatattt taaagcctgt taaaagcttt 1740 gaaatcattg atgaatttat catgcatggg aacaccgctt tagtggaaaa atccttagga 1800 ttagacacag agagtttgac tgacgctatt ttaaaagatt taggacaaga gagatga 1857 18 628 PRT Aquifex aeolicus 18 Met Leu Glu Lys Tyr Glu Ile Leu Lys Asp Tyr Lys Gly Pro Phe Asp 1 5 10 15 Ile Lys Asn Tyr Asp Tyr Glu Thr Leu Gln Lys Leu Ala Gln Glu Val 20 25 30 Arg Asp Tyr Ile Ile Asn Val Thr Ser Lys Asn Gly Gly His Val Gly 35 40 45 Pro Ser Leu Gly Val Val Glu Leu Thr Ile Ala Leu Leu Arg Val Phe 50 55 60 Asn Pro Pro Glu Asp Val Ile Val Trp Asp Ile Gly His Gln Gly Tyr 65 70 75 80 Pro Trp Lys Ile Leu Thr Asp Arg Lys Glu Gln Phe Pro Thr Leu Arg 85 90 95 Gln Tyr Lys Gly Ile Ser Gly Phe Leu Arg Arg Glu Glu Ser Ile Tyr 100 105 110 Asp Ala Phe Gly Ala Gly His Ser Ser Thr Ser Ile Ser Ala Ala Leu 115 120 125 Gly Phe Arg Ile Gly Lys Asp Leu Lys Gly Glu Lys Glu Asp Tyr Val 130 135 140 Ile Ala Val Ile Gly Asp Gly Ala Leu Thr Ala Gly Met Ala Tyr Glu 145 150 155 160 Ala Leu Asn Asn Ala Gly His Ile Arg Pro Asp Arg Phe Ile Val Ile 165 170 175 Leu Asn Asp Asn Glu Met Ser Ile Ser Pro Asn Val Gly Ala Ile Ser 180 185 190 Thr Tyr Leu Asn Arg Ile Ile Ser Gly His Phe Val Gln Glu Thr Arg 195 200 205 
Gln Lys Ile Lys Asn Phe Leu Gln His Phe Gly Glu Thr Pro Leu Arg 210 215 220 Ile Met Lys Leu Thr Glu Glu Phe Leu Lys Gly Leu Ile Ser Pro Gly 225 230 235 240 Val Ile Phe Glu Glu Leu Gly Phe Asn Tyr Ile Gly Pro Ile Asp Gly 245 250 255 His Asp Ile Lys Ala Leu Glu Asp Thr Leu Asn Asn Val Lys Asp Ile 260 265 270 Lys Gly Pro Val Leu Leu His Val Tyr Thr Lys Lys Gly Lys Gly Tyr 275 280 285 Lys Pro Ala Glu Glu Asn Pro Val Lys Trp His Gly Val Ala Pro Tyr 290 295 300 Lys Val Glu Ser Gly Glu Ile Ile Lys Lys Ser Ser Pro Pro Thr Trp 305 310 315 320 Thr Ser Val Phe Gly Lys Ala Leu Val Glu Leu Ala Glu Arg Asp Glu 325 330 335 Lys Ile Val Ala Ile Thr Pro Ala Met Arg Glu Gly Ser Gly Leu Val 340 345 350 Glu Phe Ala Lys Arg Phe Pro Asp Arg Phe Phe Asp Val Gly Ile Ala 355 360 365 Glu Gln His Ala Cys Thr Phe Ala Ala Gly Leu Ala Ala Glu Gly Leu 370 375 380 Arg Pro Val Ala Ala Tyr Tyr Ser Thr Phe Leu Gln Arg Ala Tyr Asp 385 390 395 400 Gln Val Ile His Asp Val Ala Leu Gln Asn Leu Pro Val Thr Phe Ala 405 410 415 Ile Asp Arg Ala Gly Leu Val Gly Asp Asp Gly Pro Thr His His Gly 420 425 430 Val Phe Asp Leu Ser Tyr Leu Arg Cys Val Pro Asn Met Val Val Cys 435 440 445 Ala Pro Lys Asp Glu Gln Glu Leu Arg Asp Leu Leu Tyr Thr Gly Ile 450 455 460 Tyr Ser Gly Lys Pro Phe Ala Leu Arg Tyr Pro Arg Gly Ala Ala Tyr 465 470 475 480 Gly Val Pro Thr Glu Gly Phe Lys Lys Ile Glu Ile Gly Thr Trp Glu 485 490 495 Glu Leu Leu Glu Gly Glu Asp Cys Val Ile Leu Ala Val Gly Tyr Pro 500 505 510 Val Tyr Gln Ala Leu Arg Ala Ala Glu Lys Leu Tyr Lys Glu Gly Ile 515 520 525 Arg Val Gly Val Val Asn Ala Arg Phe Val Lys Pro Met Asp Glu Lys 530 535 540 Met Leu Arg Asp Leu Ala Asn Arg Tyr Asp Thr Phe Ile Thr Val Glu 545 550 555 560 Asp Asn Thr Val Val Gly Gly Phe Gly Ser Gly Val Leu Glu Phe Phe 565 570 575 Ala Arg Glu Gly Ile Met Lys Arg Val Ile Asn Leu Gly Val Pro Asp 580 585 590 Arg Phe Ile Glu His Gly Lys Gln Asp Ile Leu Arg Asn Leu Val Gly 595 600 605 Ile Asp Ala Glu Gly Ile Glu Lys Ala Val Arg Asp Ala Leu Lys Gly 610 615 620 Gly 
Arg Leu Ile 625 19 633 PRT Bacillus subtilis 19 Met Asp Leu Leu Ser Ile Gln Asp Pro Ser Phe Leu Lys Asn Met Ser 1 5 10 15 Ile Asp Glu Leu Glu Lys Leu Ser Asp Glu Ile Arg Gln Phe Leu Ile 20 25 30 Thr Ser Leu Ser Ala Ser Gly Gly His Ile Gly Pro Asn Leu Gly Val 35 40 45 Val Glu Leu Thr Val Ala Leu His Lys Glu Phe Asn Ser Pro Lys Asp 50 55 60 Lys Phe Leu Trp Asp Val Gly His Gln Ser Tyr Val His Lys Leu Leu 65 70 75 80 Thr Gly Arg Gly Lys Glu Phe Ala Thr Leu Arg Gln Tyr Lys Gly Leu 85 90 95 Cys Gly Phe Pro Lys Arg Ser Glu Ser Glu His Asp Val Trp Glu Thr 100 105 110 Gly His Ser Ser Thr Ser Leu Ser Gly Ala Met Gly Met Ala Ala Ala 115 120 125 Arg Asp Ile Lys Gly Thr Asp Glu Tyr Ile Ile Pro Ile Ile Gly Asp 130 135 140 Gly Ala Leu Thr Gly Gly Met Ala Leu Glu Ala Leu Asn His Ile Gly 145 150 155 160 Asp Glu Lys Lys Asp Met Ile Val Ile Leu Asn Asp Asn Glu Met Ser 165 170 175 Ile Ala Pro Asn Val Gly Ala Ile His Ser Met Leu Gly Arg Leu Arg 180 185 190 Thr Ala Gly Lys Tyr Gln Trp Val Lys Asp Glu Leu Glu Tyr Leu Phe 195 200 205 Lys Lys Ile Pro Ala Val Gly Gly Lys Leu Ala Ala Thr Ala Glu Arg 210 215 220 Val Lys Asp Ser Leu Lys Tyr Met Leu Val Ser Gly Met Phe Phe Glu 225 230 235 240 Glu Leu Gly Phe Thr Tyr Leu Gly Pro Val Asp Gly His Ser Tyr His 245 250 255 Glu Leu Ile Glu Asn Leu Gln Tyr Ala Lys Lys Thr Lys Gly Pro Val 260 265 270 Leu Leu His Val Ile Thr Lys Lys Gly Lys Gly Tyr Lys Pro Ala Glu 275 280 285 Thr Asp Thr Ile Gly Thr Trp His Gly Thr Gly Pro Tyr Lys Ile Asn 290 295 300 Thr Gly Asp Phe Val Lys Pro Lys Ala Ala Ala Pro Ser Trp Ser Gly 305 310 315 320 Leu Val Ser Gly Thr Val Gln Arg Met Ala Arg Glu Asp Gly Arg Ile 325 330 335 Val Ala Ile Thr Pro Ala Met Pro Val Gly Ser Lys Leu Glu Gly Phe 340 345 350 Ala Lys Glu Phe Pro Asp Arg Met Phe Asp Val Gly Ile Ala Glu Gln 355 360 365 His Ala Ala Thr Met Ala Ala Ala Met Ala Met Gln Gly Met Lys Pro 370 375 380 Phe Leu Ala Ile Tyr Ser Thr Phe Leu Gln Arg Ala Tyr Asp Gln Val 385 390 395 400 Val His Asp Ile Cys Arg Gln Asn Ala Asn Val Phe 
Ile Gly Ile Asp 405 410 415 Arg Ala Gly Leu Val Gly Ala Asp Gly Glu Thr His Gln Gly Val Phe 420 425 430 Asp Ile Ala Phe Met Arg His Ile Pro Asn Met Val Leu Met Met Pro 435 440 445 Lys Asp Glu Asn Glu Gly Gln His Met Val His Thr Ala Leu Ser Tyr 450 455 460 Asp Glu Gly Pro Ile Ala Met Arg Phe Pro Arg Gly Asn Gly Leu Gly 465 470 475 480 Val Lys Met Asp Glu Gln Leu Lys Thr Ile Pro Ile Gly Thr Trp Glu 485 490 495 Val Leu Arg Pro Gly Asn Asp Ala Val Ile Leu Thr Phe Gly Thr Thr 500 505 510 Ile Glu Met Ala Ile Glu Ala Ala Glu Glu Leu Gln Lys Glu Gly Leu 515 520 525 Ser Val Arg Val Val Asn Ala Arg Phe Ile Lys Pro Ile Asp Glu Lys 530 535 540 Met Met Lys Ser Ile Leu Lys Glu Gly Leu Pro Ile Leu Thr Ile Glu 545 550 555 560 Glu Ala Val Leu Glu Gly Gly Phe Gly Ser Ser Ile Leu Glu Phe Ala 565 570 575 His Asp Gln Gly Glu Tyr His Thr Pro Ile Asp Arg Met Gly Ile Pro 580 585 590 Asp Arg Phe Ile Glu His Gly Ser Val Thr Ala Leu Leu Glu Glu Ile 595 600 605 Gly Leu Thr Lys Gln Gln Val Ala Asn Arg Ile Arg Leu Leu Met Pro 610 615 620 Pro Lys Thr His Lys Gly Ile Gly Ser 625 630 20 735 PRT Chlamydomonas reinhardtii 20 Met Leu Arg Gly Ala Val Ser His Gly Pro Ala Val Ala Asp Arg Ala 1 5 10 15 Ala Ala Gly Pro Ala Arg Cys Ala Ala Pro Val Ala Arg Gly Val Arg 20 25 30 Ser Ala Ala Pro Thr Arg Gln Arg Arg Ala Glu Ala Ser Val Asn Ala 35 40 45 Pro Arg Ala Gly Pro Ala Gly Ser Tyr Ser Gly Glu Trp Asp Lys Leu 50 55 60 Ser Val Glu Glu Ile Asp Glu Trp Arg Asp Val Gly Pro Lys Thr Pro 65 70 75 80 Leu Leu Asp Thr Val Asn Tyr Pro Val His Leu Lys Asn Phe Asn Asn 85 90 95 Glu Gln Leu Lys Gln Leu Cys Lys Glu Leu Arg Ser Asp Ile Val His 100 105 110 Thr Val Ser Arg Thr Gly Gly His Leu Ser Ser Ser Leu Gly Val Val 115 120 125 Glu Leu Thr Val Ala Met His Tyr Val Phe Asn Thr Pro Glu Asp Lys 130 135 140 Ile Ile Trp Asp Val Gly His Gln Ala Tyr Gly His Lys Ile Leu Thr 145 150 155 160 Gly Arg Arg Lys Gly Met Ala Thr Ile Arg Gln Thr Asn Gly Leu Ser 165 170 175 Gly Phe Thr Lys Arg Asp Glu Ser Glu Tyr Asp Pro Phe Gly Ala Gly 
180 185 190 His Ser Ser Thr Ser Ile Ser Ala Ala Leu Gly Met Ala Val Gly Arg 195 200 205 Asp Val Lys Gly Lys Lys Asn Ser Val Ile Ala Val Ile Gly Asp Gly 210 215 220 Ala Ile Thr Gly Gly Met Ala Tyr Glu Ala Met Asn His Ala Gly Phe 225 230 235 240 Leu Asp Lys Asn Met Ile Val Ile Leu Asn Asp Asn Gln Gln Val Ser 245 250 255 Leu Pro Thr Gln Tyr Asn Asn Lys Asn Gln Asp Pro Val Gly Ala Leu 260 265 270 Ser Ser Ala Leu Ala Arg Leu Gln Ala Asn Arg Pro Leu Arg Glu Leu 275 280 285 Arg Glu Ile Ala Lys Gly Val Thr Lys Gln Leu Pro Asp Val Val Gln 290 295 300 Lys Ala Thr Ala Lys Ile Asp Glu Tyr Ala Arg Gly Met Ile Ser Gly 305 310 315 320 Thr Gly Ser Thr Leu Phe Glu Glu Leu Gly Leu Tyr Tyr Ile Gly Pro 325 330 335 Val Asp Gly His Asn Leu Asp Asp Leu Ile Ala Val Leu Ser Glu Val 340 345 350 Arg Ser Ala Glu Thr Val Gly Pro Val Leu Val His Val Val Thr Glu 355 360 365 Lys Gly Arg Gly Tyr Leu Pro Ala Glu Thr Ala Gln Asp Lys Met His 370 375 380 Gly Val Val Lys Phe Asp Pro Arg Thr Gly Lys Gln Val Gln Ala Lys 385 390 395 400 Thr Lys Ala Met Ser Tyr Thr Asn Tyr Phe Ala Asp Ala Leu Thr Ala 405 410 415 Glu Ala Glu Arg Asp Ser Arg Ile Val Ala Val His Ala Ala Met Ala 420 425 430 Gly Gly Thr Gly Leu Tyr Arg Phe Glu Lys Lys Phe Pro Asp Arg Thr 435 440 445 Phe Asp Val Gly Ile Ala Glu Gln His Ala Val Thr Phe Ala Ala Gly 450 455 460 Leu Ala Cys Glu Gly Leu Val Pro Phe Cys Thr Ile Tyr Ser Thr Phe 465 470 475 480 Met Gln Arg Gly Tyr Asp Gln Ile Val His Asp Val Ser Leu Gln Lys 485 490 495 Leu Pro Val Arg Phe Ala Met Asp Arg Ala Gly Leu Val Gly Ala Asp 500 505 510 Gly Ser Thr His Cys Gly Ala Phe Asp Val Thr Phe Met Ala Ser Leu 515 520 525 Pro His Met Ile Thr Met Ala Pro Ser Asn Glu Ala Glu Leu Ile Asn 530 535 540 Met Val Ala Thr Cys Ala Ala Ile Asp Asp Ala Pro Ser Cys Phe Arg 545 550 555 560 Phe Pro Arg Gly Asn Gly Leu Gly Leu Asp Leu Ala Ala Tyr Gly Ile 565 570 575 Ser Lys Asp Leu Lys Gly Val Pro Leu Glu Val Gly Lys Gly Val Val 580 585 590 Arg Arg Gln Gly Lys Asp Val Cys Leu Val Ala Tyr Gly Ser Ser Val 595 
600 605 Asn Glu Ala Leu Ala Ala Ala Asp Met Leu Glu Arg Asp Gly Val Ser 610 615 620 Thr Thr Val Ile Asp Ala Arg Phe Cys Lys Pro Leu Asp Thr Lys Leu 625 630 635 640 Ile Arg Ser Ala Ala Lys Glu His Pro Val Met Ile Thr Ile Glu Glu 645 650 655 Gly Ser Val Gly Gly Phe Ala Ala His Val Met Gln Phe Leu Ala Leu 660 665 670 Glu Gly Leu Leu Asp Gly Gly Leu Lys Phe Arg Pro Met Thr Leu Pro 675 680 685 Asp Arg Tyr Ile Asp His Gly Asp Tyr Arg Asp Gln Leu Ala Met Ala 690 695 700 Gly Leu Thr Ser Gln His Ile Ala Ser Thr Ala Leu Thr Thr Leu Gly 705 710 715 720 Arg Ala Lys Asp Ala Ala Lys Phe Ser Leu Ser Ala Leu Gln Ala 725 730 735 21 615 PRT Campylobacter jejuni 21 Met Ser Lys Lys Phe Ala His Thr Gln Glu Glu Leu Glu Lys Leu Ser 1 5 10 15 Leu Lys Glu Leu Glu Asn Leu Ala Ala Ser Met Arg Glu Lys Ile Ile 20 25 30 Gln Val Val Ser Lys Asn Gly Gly His Leu Ser Ser Asn Leu Gly Ala 35 40 45 Val Glu Leu Ser Ile Ala Met His Leu Val Phe Asp Ala Lys Lys Asp 50 55 60 Pro Phe Ile Phe Asp Val Ser His Gln Ser Tyr Thr His Lys Leu Leu 65 70 75 80 Ser Gly Lys Glu Glu Ile Phe Asp Thr Leu Arg Gln Ile Asn Gly Leu 85 90 95 Ser Gly Tyr Thr Lys Pro Ser Glu Gly Asp Tyr Phe Val Ala Gly His 100 105 110 Ser Ser Thr Ser Ile Ser Leu Ala Val Gly Ala Cys Lys Ala Ile Ala 115 120 125 Leu Lys Gly Glu Lys Arg Ile Pro Val Ala Leu Ile Gly Asp Gly Ala 130 135 140 Leu Ser Ala Gly Met Ala Tyr Glu Ala Leu Asn Glu Leu Gly Asp Ser 145 150 155 160 Lys Phe Pro Cys Val Ile Leu Leu Asn Asp Asn Glu Met Ser Ile Ser 165 170 175 Lys Pro Ile Gly Ala Ile Ser Lys Tyr Leu Ser Gln Ala Met Ala Thr 180 185 190 Gln Phe Tyr Gln Ser Phe Lys Lys Arg Ile Ala Lys Met Leu Asp Ile 195 200 205 Leu Pro Asp Ser Ala Thr Tyr Met Ala Lys Arg Phe Glu Glu Ser Phe 210 215 220 Lys Leu Ile Thr Pro Gly Leu Leu Phe Glu Glu Leu Gly Leu Glu Tyr 225 230 235 240 Ile Gly Pro Ile Asp Gly His Asn Leu Gly Glu Ile Ile Ser Ala Leu 245 250 255 Lys Gln Ala Lys Ala Met Gln Lys Pro Cys Val Ile His Ala Gln Thr 260 265 270 Ile Lys Gly Lys Gly Tyr Ala Leu Ala Glu Gly Lys His Ala 
Lys Trp 275 280 285 His Gly Val Gly Ala Phe Asp Ile Asp Ser Gly Glu Ser Val Lys Lys 290 295 300 Ser Asp Thr Lys Lys Ser Ala Thr Glu Ile Phe Ser Lys Asn Leu Leu 305 310 315 320 Asp Leu Ala Ser Lys Tyr Glu Asn Ile Val Gly Val Thr Ala Ala Met 325 330 335 Pro Ser Gly Thr Gly Leu Asp Lys Leu Ile Glu Lys Tyr Pro Asn Arg 340 345 350 Phe Trp Asp Val Ala Ile Ala Glu Gln His Ala Val Thr Ser Met Ala 355 360 365 Ala Met Ala Lys Glu Gly Phe Lys Pro Phe Ile Ala Ile Tyr Ser Thr 370 375 380 Phe Leu Gln Arg Ala Tyr Asp Gln Val Ile His Asp Cys Ala Ile Met 385 390 395 400 Asn Leu Asn Val Val Phe Ala Met Asp Arg Ala Gly Ile Val Gly Glu 405 410 415 Asp Gly Glu Thr His Gln Gly Val Phe Asp Leu Ser Phe Leu Ala Pro 420 425 430 Leu Pro Asn Phe Thr Leu Leu Ala Pro Arg Asp Glu Gln Met Met Gln 435 440 445 Asn Ile Met Glu Tyr Ala Tyr Leu His Gln Gly Pro Ile Ala Leu Arg 450 455 460 Tyr Pro Arg Gly Ser Phe Ile Leu Asp Lys Glu Phe Asn Pro Cys Glu 465 470 475 480 Ile Lys Leu Gly Lys Ala Gln Trp Leu Val Lys Asn Asn Ser Glu Ile 485 490 495 Ala Phe Leu Gly Tyr Gly Gln Gly Val Ala Lys Ala Trp Gln Val Leu 500 505 510 Arg Ala Leu Gln Glu Met Asn Asn Asn Ala Asn Leu Ile Asp Leu Ile 515 520 525 Phe Ala Lys Pro Leu Asp Glu Glu Leu Leu Cys Glu Leu Ala Lys Lys 530 535 540 Ser Lys Ile Trp Phe Ile Phe Ser Glu Asn Val Lys Ile Gly Gly Ile 545 550 555 560 Glu Ser Leu Ile Asn Asn Phe Leu Gln Lys Tyr Asp Leu His Val Lys 565 570 575 Val Val Ser Phe Glu Tyr Glu Asp Lys Phe Ile Glu His Gly Lys Thr 580 585 590 Ser Glu Val Glu Lys Asn Leu Glu Lys Asp Val Asn Ser Leu Leu Thr 595 600 605 Lys Val Leu Lys Phe Tyr His 610 615 22 719 PRT Lycopersicon esculentum 22 Met Ala Leu Cys Ala Tyr Ala Phe Pro Gly Ile Leu Asn Arg Thr Gly 1 5 10 15 Val Val Ser Asp Ser Ser Lys Ala Thr Pro Leu Phe Ser Gly Trp Ile 20 25 30 His Gly Thr Asp Leu Gln Phe Leu Phe Gln His Lys Leu Thr His Glu 35 40 45 Val Lys Lys Arg Ser Arg Val Val Gln Ala Ser Leu Ser Glu Ser Gly 50 55 60 Glu Tyr Tyr Thr Gln Arg Pro Pro Thr Pro Ile Leu Asp Thr Val Asn 65 70 75 80 
Tyr Pro Ile His Met Lys Asn Leu Ser Leu Lys Glu Leu Lys Gln Leu 85 90 95 Ala Asp Glu Leu Arg Ser Asp Thr Ile Phe Asn Val Ser Lys Thr Gly 100 105 110 Gly His Leu Gly Ser Ser Leu Gly Val Val Glu Leu Thr Val Ala Leu 115 120 125 His Tyr Val Phe Asn Ala Pro Gln Asp Arg Ile Leu Trp Asp Val Gly 130 135 140 His Gln Ser Tyr Pro His Lys Ile Leu Thr Gly Arg Arg Asp Lys Met 145 150 155 160 Ser Thr Leu Arg Gln Thr Asp Gly Leu Ala Gly Phe Thr Lys Arg Ser 165 170 175 Glu Ser Glu Tyr Asp Cys Phe Gly Thr Gly His Ser Ser Thr Thr Ile 180 185 190 Ser Ala Gly Leu Gly Met Ala Val Gly Arg Asp Leu Lys Gly Arg Asn 195 200 205 Asn Asn Val Ile Ala Val Ile Gly Asp Gly Ala Met Thr Ala Gly Gln 210 215 220 Ala Tyr Glu Ala Met Asn Asn Ala Gly Tyr Leu Asp Ser Asp Met Ile 225 230 235 240 Val Ile Leu Asn Asp Asn Arg Gln Val Ser Leu Pro Thr Ala Thr Leu 245 250 255 Asp Gly Pro Val Ala Pro Val Gly Ala Leu Ser Ser Ala Leu Ser Arg 260 265 270 Leu Gln Ser Asn Arg Pro Leu Arg Glu Leu Arg Glu Val Ala Lys Gly 275 280 285 Val Thr Lys Gln Ile Gly Gly Pro Met His Glu Leu Ala Ala Lys Val 290 295 300 Asp Glu Tyr Ala Arg Gly Met Ile Ser Gly Ser Gly Ser Thr Leu Phe 305 310 315 320 Glu Glu Leu Gly Leu Tyr Tyr Ile Gly Pro Val Asp Gly His Asn Ile 325 330 335 Asp Asp Leu Ile Ala Ile Leu Lys Glu Val Arg Ser Thr Lys Thr Thr 340 345 350 Gly Pro Val Leu Ile His Val Val Thr Glu Lys Gly Arg Gly Tyr Pro 355 360 365 Tyr Ala Glu Arg Ala Ala Asp Lys Tyr His Gly Val Ala Lys Phe Asp 370 375 380 Pro Ala Thr Gly Lys Gln Phe Lys Ala Ser Ala Lys Thr Gln Ser Tyr 385 390 395 400 Thr Thr Tyr Phe Ala Glu Ala Leu Ile Ala Glu Ala Glu Ala Asp Lys 405 410 415 Asp Ile Val Ala Ile His Ala Ala Met Gly Gly Gly Thr Gly Met Asn 420 425 430 Leu Phe His Arg Arg Phe Pro Thr Arg Cys Phe Asp Val Gly Ile Ala 435 440 445 Glu Gln His Ala Val Thr Phe Ala Ala Gly Leu Ala Cys Glu Gly Ile 450 455 460 Lys Pro Phe Cys Ala Ile Tyr Ser Ser Phe Met Gln Arg Ala Tyr Asp 465 470 475 480 Gln Val Val His Asp Val Asp Leu Gln Lys Leu Pro Val Arg Phe Ala 485 490 495 Met 
Asp Arg Ala Gly Leu Val Gly Ala Asp Gly Pro Thr His Cys Gly 500 505 510 Ala Phe Asp Val Thr Tyr Met Ala Cys Leu Pro Asn Met Val Val Met 515 520 525 Ala Pro Ser Asp Glu Ala Glu Leu Phe His Met Val Ala Thr Ala Ala 530 535 540 Ala Ile Asp Asp Arg Pro Ser Cys Phe Arg Tyr Pro Arg Gly Asn Gly 545 550 555 560 Ile Gly Val Glu Leu Pro Ala Gly Asn Lys Gly Ile Pro Leu Glu Val 565 570 575 Gly Lys Gly Arg Ile Leu Ile Glu Gly Glu Arg Val Ala Leu Leu Gly 580 585 590 Tyr Gly Ser Ala Val Gln Asn Cys Leu Asp Ala Ala Ile Val Leu Glu 595 600 605 Ser Arg Gly Leu Gln Val Thr Val Ala Asp Ala Arg Phe Cys Lys Pro 610 615 620 Leu Asp His Ala Leu Ile Arg Ser Leu Ala Lys Ser His Glu Val Leu 625 630 635 640 Ile Thr Val Glu Glu Gly Ser Ile Gly Gly Phe Gly Ser His Val Val 645 650 655 Gln Phe Met Ala Leu Asp Gly Leu Leu Asp Gly Lys Leu Lys Trp Arg 660 665 670 Pro Ile Val Leu Pro Asp Arg Tyr Ile Asp His Gly Ser Pro Val Asp 675 680 685 Gln Leu Ala Glu Ala Gly Leu Thr Pro Ser His Ile Ala Ala Thr Val 690 695 700 Phe Asn Ile Leu Gly Gln Thr Arg Glu Ala Leu Glu Val Met Thr 705 710 715 23 643 PRT Mycobacterium leprae 23 Met Leu Glu Gln Ile Arg Arg Pro Ala Asp Leu Gln His Leu Ser Gln 1 5 10 15 Gln Gln Leu Arg Asp Leu Ala Ala Glu Ile Arg Glu Leu Leu Val His 20 25 30 Lys Val Ala Ala Thr Gly Gly His Leu Gly Pro Asn Leu Gly Val Val 35 40 45 Glu Leu Thr Leu Ala Leu His Arg Val Phe Asp Ser Pro His Asp Pro 50 55 60 Ile Ile Phe Asp Thr Gly His Gln Ala Tyr Val His Lys Met Leu Thr 65 70 75 80 Gly Arg Cys Gln Asp Phe Asp Ser Leu Arg Lys Lys Ala Gly Leu Ser 85 90 95 Gly Tyr Pro Ser Arg Ala Glu Ser Glu His Asp Trp Val Glu Ser Ser 100 105 110 His Ala Ser Thr Ala Leu Ser Tyr Ala Asp Gly Leu Ala Lys Ala Phe 115 120 125 Glu Leu Ala Gly Asn Arg Asn Arg His Val Val Ala Val Val Gly Asp 130 135 140 Gly Ala Leu Thr Gly Gly Met Cys Trp Glu Ala Leu Asn Asn Ile Ala 145 150 155 160 Ala Thr Pro Arg Pro Val Val Ile Val Val Asn Asp Asn Gly Arg Ser 165 170 175 Tyr Ala Pro Thr Ile Gly Gly Val Ala Asp His Leu Ala Thr Leu Arg 180 185 
190 Leu Gln Pro Ala Tyr Glu Arg Leu Leu Glu Lys Gly Arg Asp Ala Leu 195 200 205 His Ser Leu Pro Leu Ile Gly Gln Ile Ala Tyr Arg Phe Met His Ser 210 215 220 Val Lys Ala Gly Ile Lys Asp Ser Leu Ser Pro Gln Leu Leu Phe Thr 225 230 235 240 Asp Leu Gly Leu Lys Tyr Val Gly Pro Val Asp Gly His Asp Glu His 245 250 255 Ala Val Glu Val Ala Leu Arg Lys Ala Arg Gly Phe Gly Gly Pro Val 260 265 270 Ile Val His Val Val Thr Arg Lys Gly Met Gly Tyr Pro Pro Ala Glu 275 280 285 Ala Asp Gln Ala Glu Gln Met His Thr Cys Gly Val Met Asp Pro Thr 290 295 300 Thr Gly Gln Pro Thr Lys Ile Ala Ala Pro Asp Trp Thr Ala Ile Phe 305 310 315 320 Ser Asp Ala Leu Ile Gly Tyr Ala Met Lys Arg Arg Asp Ile Val Ala 325 330 335 Ile Thr Ala Ala Met Pro Gly Pro Thr Gly Leu Thr Ala Phe Gly Gln 340 345 350 Cys Phe Pro Asp Arg Leu Phe Asp Val Gly Ile Ala Glu Gln His Ala 355 360 365 Met Thr Ser Ala Ala Gly Leu Ala Met Gly Arg Met His Pro Val Val 370 375 380 Ala Ile Tyr Ser Thr Phe Leu Asn Arg Ala Phe Asp Gln Ile Met Met 385 390 395 400 Asp Val Ala Leu His Lys Leu Pro Val Thr Met Val Ile Asp Arg Ala 405 410 415 Gly Ile Thr Gly Ser Asp Gly Pro Ser His Asn Gly Met Trp Asp Leu 420 425 430 Ser Met Leu Gly Ile Val Pro Gly Met Arg Val Ala Ala Pro Arg Asp 435 440 445 Ala Ile Arg Leu Arg Glu Glu Leu Gly Glu Ala Leu Asp Val Asp Asp 450 455 460 Gly Pro Thr Ala Ile Arg Phe Pro Lys Gly Asp Val Cys Glu Asp Ile 465 470 475 480 Pro Ala Leu Lys Arg Arg Ser Gly Val Asp Val Leu Ala Val Pro Ala 485 490 495 Thr Gly Leu Ala Gln Asp Val Leu Leu Val Gly Val Gly Val Phe Ala 500 505 510 Ser Met Ala Leu Ala Val Ala Lys Arg Leu His Asn Gln Gly Ile Gly 515 520 525 Val Thr Val Ile Asp Pro Arg Trp Val Leu Pro Val Cys Asp Gly Val 530 535 540 Leu Glu Leu Ala His Thr His Lys Leu Ile Val Thr Leu Glu Asp Asn 545 550 555 560 Gly Val Asn Gly Gly Val Gly Ala Ala Val Ser Thr Ala Leu Arg Gln 565 570 575 Val Glu Ile Asp Thr Pro Cys Arg Asp Val Gly Leu Pro Gln Glu Phe 580 585 590 Tyr Asp His Ala Ser Arg Ser Glu Val Leu Ala Asp Leu Gly Leu Thr 595 600 605 
Asp Gln Asp Val Ala Arg Arg Ile Thr Gly Trp Val Val Ala Phe Gly 610 615 620 His Cys Gly Ser Gly Asp Asp Ala Gly Gln Tyr Gly Pro Arg Ser Ser 625 630 635 640 Gln Thr Met 24 638 PRT Mycobacterium tuberculosis 24 Met Leu Gln Gln Ile Arg Gly Pro Ala Asp Leu Gln His Leu Ser Gln 1 5 10 15 Ala Gln Leu Arg Glu Leu Ala Ala Glu Ile Arg Glu Phe Leu Ile His 20 25 30 Lys Val Ala Ala Thr Gly Gly His Leu Gly Pro Asn Leu Gly Val Val 35 40 45 Glu Leu Thr Leu Ala Leu His Arg Val Phe Asp Ser Pro His Asp Pro 50 55 60 Ile Ile Phe Asp Thr Gly His Gln Ala Tyr Val His Lys Met Leu Thr 65 70 75 80 Gly Arg Ser Gln Asp Phe Ala Thr Leu Arg Lys Lys Gly Gly Leu Ser 85 90 95 Gly Tyr Pro Ser Arg Ala Glu Ser Glu His Asp Trp Val Glu Ser Ser 100 105 110 His Ala Ser Ala Ala Leu Ser Tyr Ala Asp Gly Leu Ala Lys Ala Phe 115 120 125 Glu Leu Thr Gly His Arg Asn Arg His Val Val Ala Val Val Gly Asp 130 135 140 Gly Ala Leu Thr Gly Gly Met Cys Trp Glu Ala Leu Asn Asn Ile Ala 145 150 155 160 Ala Ser Arg Arg Pro Val Ile Ile Val Val Asn Asp Asn Gly Arg Ser 165 170 175 Tyr Ala Pro Thr Ile Gly Gly Val Ala Asp His Leu Ala Thr Leu Arg 180 185 190 Leu Gln Pro Ala Tyr Glu Gln Ala Leu Glu Thr Gly Arg Asp Leu Val 195 200 205 Arg Ala Val Pro Leu Val Gly Gly Leu Trp Phe Arg Phe Leu His Ser 210 215 220 Val Lys Ala Gly Ile Lys Asp Ser Leu Ser Pro Gln Leu Leu Phe Thr 225 230 235 240 Asp Leu Gly Leu Lys Tyr Val Gly Pro Val Asp Gly His Asp Glu Arg 245 250 255 Ala Val Glu Val Ala Leu Arg Ser Ala Arg Arg Phe Gly Ala Pro Val 260 265 270 Ile Val His Val Val Thr Arg Lys Gly Met Gly Tyr Pro Pro Ala Glu 275 280 285 Ala Asp Gln Ala Glu Gln Met His Ser Thr Val Pro Ile Asp Pro Ala 290 295 300 Thr Gly Gln Ala Thr Lys Val Ala Gly Pro Gly Trp Thr Ala Thr Phe 305 310 315 320 Ser Asp Ala Leu Ile Gly Tyr Ala Gln Lys Arg Arg Asp Ile Val Ala 325 330 335 Ile Thr Ala Ala Met Pro Gly Pro Thr Gly Leu Thr Ala Phe Gly Gln 340 345 350 Arg Phe Pro Asp Arg Leu Phe Asp Val Gly Ile Ala Glu Gln His Ala 355 360 365 Met Thr Ser Ala Ala Gly Leu Ala Met Gly Gly 
Leu His Pro Val Val 370 375 380 Ala Ile Tyr Ser Thr Phe Leu Asn Arg Ala Phe Asp Gln Ile Met Met 385 390 395 400 Asp Val Ala Leu His Lys Leu Pro Val Thr Met Val Leu Asp Arg Ala 405 410 415 Gly Ile Thr Gly Ser Asp Gly Ala Ser His Asn Gly Met Trp Asp Leu 420 425 430 Ser Met Leu Gly Ile Val Pro Gly Ile Arg Val Ala Ala Pro Arg Asp 435 440 445 Ala Thr Arg Leu Arg Glu Glu Leu Gly Glu Ala Leu Asp Val Asp Asp 450 455 460 Gly Pro Thr Ala Leu Arg Phe Pro Lys Gly Asp Val Gly Glu Asp Ile 465 470 475 480 Ser Ala Leu Glu Arg Arg Gly Gly Val Asp Val Leu Ala Ala Pro Ala 485 490 495 Asp Gly Leu Asn His Asp Val Leu Leu Val Ala Ile Gly Ala Phe Ala 500 505 510 Pro Met Ala Leu Ala Val Ala Lys Arg Leu His Asn Gln Gly Ile Gly 515 520 525 Val Thr Val Ile Asp Pro Arg Trp Val Leu Pro Val Ser Asp Gly Val 530 535 540 Arg Glu Leu Ala Val Gln His Lys Leu Leu Val Thr Leu Glu Asp Asn 545 550 555 560 Gly Val Asn Gly Gly Ala Gly Ser Ala Val Ser Ala Ala Leu Arg Arg 565 570 575 Ala Glu Ile Asp Val Pro Cys Arg Asp Val Gly Leu Pro Gln Glu Phe 580 585 590 Tyr Glu His Ala Ser Arg Ser Glu Val Leu Ala Asp Leu Gly Leu Thr 595 600 605 Asp Gln Asp Val Ala Arg Arg Ile Thr Gly Trp Val Ala Ala Leu Gly 610 615 620 Thr Gly Val Cys Ala Ser Asp Ala Ile Pro Glu His Leu Asp 625 630 635 25 641 PRT Rhodobacter capsulatus 25 Met Ser Ala Thr Pro Ser Arg Thr Pro His Leu Asp Arg Val Thr Gly 1 5 10 15 Pro Ala Asp Leu Lys Ala Met Ser Ile Ala Asp Leu Thr Ala Leu Ala 20 25 30 Ser Glu Val Arg Arg Glu Ile Val Glu Val Val Ser Gln Thr Gly Gly 35 40 45 His Leu Gly Ser Ser Leu Gly Val Val Glu Leu Thr Val Ala Leu His 50 55 60 Ala Val Phe Asn Ser Pro Gly Asp Lys Leu Ile Trp Asp Val Gly His 65 70 75 80 Gln Cys Tyr Pro His Lys Ile Leu Thr Gly Arg Arg Ser Arg Met Leu 85 90 95 Thr Leu Arg Gln Ala Gly Gly Ile Ser Gly Phe Pro Lys Arg Ser Glu 100 105 110 Ser Pro His Asp Ala Phe Gly Ala Gly His Ser Ser Thr Ser Ile Ser 115 120 125 Ala Ala Leu Gly Phe Ala Val Gly Arg Glu Leu Gly Gln Pro Val Gly 130 135 140 Asp Thr Ile Ala Ile Ile Gly Asp Gly Ser 
Ile Thr Ala Gly Met Ala 145 150 155 160 Tyr Glu Ala Leu Asn His Ala Gly His Leu Lys Ser Arg Met Phe Val 165 170 175 Ile Leu Asn Asp Asn Asp Met Ser Ile Ala Pro Pro Val Gly Ala Leu 180 185 190 Gln His Tyr Leu Asn Thr Ile Ala Arg Gln Ala Pro Phe Ala Ala Leu 195 200 205 Lys Ala Ala Ala Glu Gly Ile Glu Met His Leu Pro Gly Pro Val Arg 210 215 220 Asp Gly Ala Arg Arg Ala Arg Gln Met Val Thr Ala Met Pro Gly Gly 225 230 235 240 Ala Thr Leu Phe Glu Glu Leu Gly Phe Asp Tyr Ile Gly Pro Val Asp 245 250 255 Gly His Asp Met Ala Glu Leu Val Glu Thr Leu Arg Val Thr Arg Ala 260 265 270 Arg Ala Ser Gly Pro Val Leu Ile His Val Cys Thr Thr Lys Gly Lys 275 280 285 Gly Tyr Ala Pro Ala Glu Gly Ala Glu Asp Lys Leu His Gly Val Ser 290 295 300 Lys Phe Asp Ile Glu Thr Gly Lys Gln Lys Lys Ser Ile Pro Asn Ala 305 310 315 320 Pro Asn Tyr Thr Ala Val Phe Gly Glu Arg Leu Thr Glu Glu Ala Ala 325 330 335 Arg Asp Gln Ala Ile Val Ala Val Thr Ala Ala Met Pro Thr Gly Thr 340 345 350 Gly Leu Asp Ile Met Gln Lys Arg Phe Pro Arg Arg Val Phe Asp Val 355 360 365 Gly Ile Ala Glu Gln His Ala Val Thr Phe Ala Ala Gly Met Ala Ala 370 375 380 Ala Gly Leu Lys Pro Phe Leu Ala Leu Tyr Ser Ser Phe Val Gln Arg 385 390 395 400 Gly Tyr Asp Gln Leu Val His Asp Val Ala Leu Gln Asn Leu Pro Val 405 410 415 Arg Leu Met Ile Asp Arg Ala Gly Leu Val Gly Gln Asp Gly Ala Thr 420 425 430 His Ala Gly Ala Phe Asp Val Ser Met Leu Ala Asn Leu Pro Asn Phe 435 440 445 Thr Val Met Ala Ala Ala Asp Glu Ala Glu Leu Cys His Met Val Val 450 455 460 Thr Ala Ala Ala His Asp Ser Gly Pro Ile Ala Leu Arg Tyr Pro Arg 465 470 475 480 Gly Glu Gly Arg Gly Val Glu Met Pro Glu Arg Gly Glu Val Leu Glu 485 490 495 Ile Gly Lys Gly Arg Val Met Thr Glu Gly Thr Glu Val Ala Ile Leu 500 505 510 Ser Phe Gly Ala His Leu Ala Gln Ala Leu Lys Ala Ala Glu Met Leu 515 520 525 Glu Ala Glu Gly Val Ser Thr Thr Val Ala Asp Ala Arg Phe Cys Arg 530 535 540 Pro Leu Asp Thr Asp Leu Ile Asp Arg Leu Ile Glu Gly His Ala Ala 545 550 555 560 Leu Ile Thr Leu Glu Gln Gly Ala Met Gly 
Gly Phe Gly Ala Met Val 565 570 575 Leu His Tyr Leu Ala Arg Thr Gly Gln Leu Glu Lys Gly Arg Ala Ile 580 585 590 Arg Thr Met Thr Leu Pro Asp Cys Tyr Ile Asp His Gly Ser Pro Glu 595 600 605 Glu Met Tyr Ala Trp Ala Gly Leu Thr Ala Asn Asp Ile Arg Asp Thr 610 615 620 Ala Leu Ala Ala Ala Arg Pro Ser Lys Ser Val Arg Ile Val His Ser 625 630 635 640 Ala 26 637 PRT Rhodobacter sphaeroides 26 Met Thr Asp Arg Pro Cys Thr Pro Thr Leu Asp Arg Val Thr Leu Pro 1 5 10 15 Val Asp Ile Lys Gly Leu Thr Asp Arg Glu Leu Arg Ser Leu Ala Asp 20 25 30 Glu Leu Arg Ala Glu Thr Ile Ser Ala Val Ser Val Thr Gly Gly His 35 40 45 Leu Gly Ala Gly Leu Gly Val Val Glu Leu Thr Val Ala Leu His Ala 50 55 60 Ile Phe Asp Ala Pro Arg Asp Lys Ile Ile Trp Asp Val Gly His Gln 65 70 75 80 Cys Tyr Pro His Lys Ile Leu Thr Gly Arg Arg Asp Arg Ile Arg Thr 85 90 95 Leu Arg Gln Gly Gly Gly Leu Ser Gly Phe Thr Lys Arg Ser Glu Ser 100 105 110 Pro Tyr Asp Cys Phe Gly Ala Gly His Ser Ser Thr Ser Ile Ser Ala 115 120 125 Ala Val Gly Phe Ala Ala Ala Arg Glu Met Gly Gly Asp Thr Gly Asp 130 135 140 Ala Val Ala Val Ile Gly Asp Gly Ser Met Ser Ala Gly Met Ala Phe 145 150 155 160 Glu Ala Leu Asn His Gly Gly His Leu Lys Asn Arg Val Ile Val Ile 165 170 175 Leu Asn Asp Asn Glu Met Ser Ile Ala Pro Pro Val Gly Ala Leu Ser 180 185 190 Ser Tyr Leu Ser Arg Leu Tyr Ala Gly Ala Pro Phe Gln Asp Phe Lys 195 200 205 Ala Ala Ala Lys Gly Ala Leu Gly Leu Leu Pro Glu Pro Phe Gln Glu 210 215 220 Gly Ala Arg Arg Ala Lys Glu Met Leu Lys Ser Val Thr Val Gly Gly 225 230 235 240 Thr Leu Phe Glu Glu Leu Gly Phe Ser Tyr Val Gly Pro Ile Asp Gly 245 250 255 His Asp Leu Asp Gln Leu Leu Pro Val Leu Arg Thr Val Lys Gln Arg 260 265 270 Ala His Ala Pro Val Leu Ile His Val Ile Thr Lys Lys Gly Arg Gly 275 280 285 Tyr Ala Pro Ala Glu Ala Ala Arg Asp Arg Gly His Ala Thr Asn Lys 290 295 300 Phe Asn Val Leu Thr Gly Ala Gln Val Lys Pro Val Ser Asn Ala Pro 305 310 315 320 Ser Tyr Thr Lys Val Phe Ala Gln Ser Leu Ile Lys Glu Ala Glu Val 325 330 335 Asp Glu Arg Ile Cys 
Ala Val Thr Ala Ala Met Pro Asp Gly Thr Gly 340 345 350 Leu Asn Leu Phe Gly Glu Arg Phe Pro Lys Arg Thr Phe Asp Val Gly 355 360 365 Ile Ala Glu Gln His Ala Val Thr Phe Ser Ala Ala Leu Ala Ala Gly 370 375 380 Gly Met Arg Pro Phe Cys Ala Ile Tyr Ser Thr Phe Leu Gln Arg Gly 385 390 395 400 Tyr Asp Gln Ile Val His Asp Val Ala Ile Gln Arg Leu Pro Val Arg 405 410 415 Phe Ala Ile Asp Arg Ala Gly Leu Val Gly Ala Asp Gly Ala Thr His 420 425 430 Ala Gly Ser Phe Asp Val Ala Phe Leu Ser Asn Leu Pro Gly Ile Val 435 440 445 Val Met Ala Ala Ala Asp Glu Ala Glu Leu Val His Met Val Ala Thr 450 455 460 Ala Ala Ala His Asp Glu Gly Pro Ile Ala Phe Arg Tyr Pro Arg Gly 465 470 475 480 Asp Gly Val Gly Val Glu Val Pro Val Lys Gly Val Pro Leu Gln Ile 485 490 495 Gly Arg Gly Arg Val Val Ser Glu Gly Thr Arg Ile Ala Leu Leu Ser 500 505 510 Phe Gly Thr Arg Leu Ala Glu Val Gln Val Ala Ala Glu Ala Leu Ala 515 520 525 Ala Arg Gly Ile Ser Pro Thr Val Ala Asp Ala Arg Phe Ala Lys Pro 530 535 540 Leu Asp Arg Asp Leu Ile Leu Gln Leu Ala Ala His His Glu Ala Leu 545 550 555 560 Ile Thr Ile Glu Glu Gly Ala Ile Gly Gly Phe Gly Ser His Val Ala 565 570 575 Gln Leu Leu Ala Glu Ala Gly Val Phe Asp Arg Gly Phe Arg Tyr Arg 580 585 590 Ser Met Val Leu Pro Asp Thr Phe Ile Asp His Asn Ser Ala Glu Val 595 600 605 Met Tyr Ala Thr Ala Gly Leu Asn Ala Ala Asp Ile Glu Arg Lys Ala 610 615 620 Leu Glu Thr Leu Gly Val Glu Val Leu Ala Arg Arg Ala 625 630 635 27 648 PRT Rhodobacter sphaeroides 27 Met Thr Asn Pro Thr Pro Arg Pro Glu Thr Pro Leu Leu Asp Arg Val 1 5 10 15 Cys Cys Pro Ala Asp Met Lys Ala Leu Ser Asp Ala Glu Leu Glu Arg 20 25 30 Leu Ala Asp Glu Val Arg Ser Glu Val Ile Ser Val Val Ala Glu Thr 35 40 45 Gly Gly His Leu Gly Ser Ser Leu Gly Val Val Glu Leu Thr Val Ala 50 55 60 Leu His Ala Val Phe Asn Thr Pro Thr Asp Lys Leu Val Trp Asp Val 65 70 75 80 Gly His Gln Cys Tyr Pro His Lys Ile Leu Thr Gly Arg Arg Glu Gln 85 90 95 Met Arg Thr Leu Arg Gln Lys Gly Gly Leu Ser Gly Phe Thr Lys Arg 100 105 110 Ser Glu Ser Ala Tyr 
Asp Pro Phe Gly Ala Ala His Ser Ser Thr Ser 115 120 125 Ile Ser Ala Ala Leu Gly Phe Ala Met Gly Arg Glu Leu Gly Gln Pro 130 135 140 Val Gly Asp Thr Ile Ala Val Ile Gly Asp Gly Ser Ile Thr Ala Gly 145 150 155 160 Met Ala Tyr Glu Ala Leu Asn His Ala Gly His Leu Asn Lys Arg Leu 165 170 175 Phe Val Ile Leu Asn Asp Asn Asp Met Ser Ile Ala Pro Pro Val Gly 180 185 190 Ala Leu Ala Arg Tyr Leu Val Asn Leu Ser Ser Lys Ala Pro Phe Ala 195 200 205 Thr Leu Arg Ala Ala Ala Asp Gly Leu Glu Ala Ser Leu Pro Gly Pro 210 215 220 Leu Arg Asp Gly Ala Arg Arg Ala Arg Gln Leu Val Thr Gly Met Pro 225 230 235 240 Gly Gly Gly Thr Leu Phe Glu Glu Leu Gly Phe Thr Tyr Val Gly Pro 245 250 255 Ile Asp Gly His Asp Met Glu Ala Leu Leu Gln Thr Leu Arg Ala Ala 260 265 270 Arg Ala Arg Thr Thr Gly Pro Val Leu Ile His Val Val Thr Lys Lys 275 280 285 Gly Lys Gly Tyr Ala Pro Ala Glu Asn Ala Pro Asp Lys Tyr His Gly 290 295 300 Val Asn Lys Phe Asp Pro Val Thr Gly Glu Gln Lys Lys Ser Val Ala 305 310 315 320 Asn Ala Pro Asn Tyr Thr Lys Val Phe Gly Ser Thr Leu Thr Glu Glu 325 330 335 Ala Ala Arg Asp Pro Arg Ile Val Ala Ile Thr Ala Ala Met Pro Ser 340 345 350 Gly Thr Gly Val Asp Ile Met Gln Lys Arg Phe Pro Asn Arg Val Phe 355 360 365 Asp Val Gly Ile Ala Glu Gln His Ala Val Thr Phe Ala Ala Gly Leu 370 375 380 Ala Gly Ala Gly Met Lys Pro Phe Cys Ala Ile Tyr Ser Ser Phe Leu 385 390 395 400 Gln Arg Gly Tyr Asp Gln Ile Ala His Asp Val Ala Leu Gln Asn Leu 405 410 415 Pro Val Arg Phe Val Ile Asp Arg Ala Gly Leu Val Gly Ala Asp Gly 420 425 430 Ala Thr His Ala Gly Ala Phe Asp Val Gly Phe Ile Thr Ser Leu Pro 435 440 445 Asn Met Thr Val Met Ala Ala Ala Asp Glu Ala Glu Leu Ile His Met 450 455 460 Ile Ala Thr Ala Val Ala Phe Gly Glu Gly Pro Ile Ala Phe Arg Phe 465 470 475 480 Pro Arg Gly Glu Gly Val Gly Val Glu Met Pro Glu Arg Gly Thr Val 485 490 495 Leu Glu Pro Gly Arg Gly Arg Val Val Arg Glu Gly Thr Asp Val Ala 500 505 510 Ile Leu Ser Phe Gly Ala His Leu His Glu Ala Leu Gln Ala Ala Lys 515 520 525 Leu Leu Glu Ala Glu Gly 
Val Ser Val Thr Val Ala Asp Ala Arg Phe 530 535 540 Ser Arg Pro Leu Asp Thr Gly His Ile Asp Gln Leu Val Arg His His 545 550 555 560 Ala Ala Leu Val Thr Val Glu Gln Gly Ala Met Gly Gly Phe Gly Ala 565 570 575 Tyr Val Met His Cys Leu Ala Asn Ser Gly Gly Phe Asp Gly Gly Leu 580 585 590 Ala Leu Arg Val Met Thr Leu Pro Asp Arg Phe Ile Glu Gln Ala Ser 595 600 605 Pro Glu Asp Met Tyr Ala Asp Ala Gly Leu Arg Ala Glu Asp Ile Ala 610 615 620 Ala Thr Ala Arg Gly Ala Leu Ala Arg Gly Arg Val Met Pro Leu Arg 625 630 635 640 Gln Thr Ala Lys Pro Arg Ala Val 645 28 636 PRT Synechococcus sp. PCC 6301 28 Met His Leu Ser Glu Ile Thr His Pro Asn Gln Leu His Gly Leu Ser 1 5 10 15 Val Ala Gln Leu Glu Gln Ile Gly His Gln Ile Arg Glu Lys His Leu 20 25 30 Gln Thr Val Ala Ala Thr Gly Gly His Leu Gly Pro Gly Leu Gly Val 35 40 45 Val Glu Leu Thr Leu Ala Leu Tyr Gln Thr Leu Asp Leu Asp Arg Asp 50 55 60 Lys Val Val Trp Asp Val Gly His Gln Ala Tyr Pro His Lys Leu Leu 65 70 75 80 Thr Gly Arg Tyr His Asn Phe His Thr Leu Arg Gln Lys Asp Gly Ile 85 90 95 Ala Gly Tyr Pro Lys Arg Thr Glu Asn Arg Phe Asp His Phe Gly Ala 100 105 110 Gly His Ala Ser Thr Ser Ile Ser Ala Gly Leu Gly Met Ala Leu Ala 115 120 125 Arg Asp Ala Gln Gly Glu Asp Tyr Arg Cys Val Ala Val Ile Gly Asp 130 135 140 Gly Ser Leu Thr Gly Gly Met Ala Leu Glu Ala Ile Asn His Ala Gly 145 150 155 160 His Leu Pro Lys Thr Arg Leu Leu Val Val Leu Asn Asp Asn Asp Met 165 170 175 Ser Ile Ser Pro Asn Val Gly Ala Leu Ser Arg Tyr Leu Asn Lys Ile 180 185 190 Arg Val Ser Glu Pro Met Gln Leu Leu Thr Asp Gly Leu Thr Gln Gly 195 200 205 Met Gln Gln Ile Pro Phe Val Gly Gly Ala Ile Thr Gln Gly Phe Glu 210 215 220 Pro Val Lys Glu Gly Met Lys Arg Leu Ser Tyr Ser Lys Ile Gly Ala 225 230 235 240 Val Phe Glu Glu Leu Gly Phe Thr Tyr Met Gly Pro Val Asp Gly His 245 250 255 Asn Leu Glu Glu Leu Ile Ala Thr Phe Arg Glu Ala His Lys His Thr 260 265 270 Gly Pro Val Leu Val His Val Ala Thr Thr Lys Gly Lys Gly Tyr Pro 275 280 285 Tyr Ala Glu Glu Asp Gln Val Gly Tyr His Ala 
Gln Asn Pro Phe Asp 290 295 300 Leu Ala Thr Gly Lys Ala Lys Pro Ala Ser Lys Pro Lys Pro Pro Ser 305 310 315 320 Tyr Ser Lys Val Phe Gly Gln Thr Leu Thr Thr Leu Ala Lys Ser Asp 325 330 335 Arg Arg Ile Val Gly Ile Thr Ala Ala Met Ala Thr Gly Thr Gly Leu 340 345 350 Asp Ile Leu Gln Lys Ala Leu Pro Lys Gln Tyr Ile Asp Val Gly Ile 355 360 365 Ala Glu Gln His Ala Val Val Leu Ala Ala Gly Met Ala Cys Asp Gly 370 375 380 Met Arg Pro Val Val Ala Ile Tyr Ser Thr Phe Leu Gln Arg Ala Phe 385 390 395 400 Asp Gln Val Ile His Asp Val Cys Ile Gln Lys Leu Pro Val Phe Phe 405 410 415 Cys Leu Asp Arg Ala Gly Ile Val Gly Ala Asp Gly Pro Thr His Gln 420 425 430 Gly Met Tyr Asp Ile Ala Tyr Leu Arg Leu Ile Pro Asn Met Val Leu 435 440 445 Met Ala Pro Lys Asp Glu Ala Glu Leu Gln Arg Met Leu Val Thr Gly 450 455 460 Ile Glu Tyr Asp Gly Pro Ile Ala Met Arg Phe Pro Arg Gly Asn Gly 465 470 475 480 Ile Gly Val Pro Leu Pro Glu Glu Gly Trp Glu Ser Leu Pro Ile Gly 485 490 495 Lys Ala Glu Gln Leu Arg Gln Gly Asp Asp Leu Leu Met Leu Ala Tyr 500 505 510 Gly Ser Met Val Tyr Pro Ala Leu Gln Thr Ala Glu Leu Leu Asn Glu 515 520 525 His Gly Ile Ser Ala Thr Val Ile Asn Ala Arg Phe Ala Lys Pro Leu 530 535 540 Asp Glu Glu Leu Ile Val Pro Leu Ala Arg Gln Ile Gly Lys Val Val 545 550 555 560 Thr Phe Glu Glu Gly Cys Leu Pro Gly Gly Phe Gly Ser Ala Ile Met 565 570 575 Glu Ser Leu Gln Ala His Asp Leu Gln Val Pro Val Leu Pro Ile Gly 580 585 590 Val Pro Asp Leu Leu Val Glu His Ala Ser Pro Asp Glu Ser Lys Gln 595 600 605 Glu Leu Gly Leu Thr Pro Arg Gln Met Ala Asp Arg Ile Leu Glu Lys 610 615 620 Phe Gly Ser Arg Gln Arg Ile Gly Ala Ala Ser Ala 625 630 635 29 640 PRT Synechocystis sp. 
PCC 6803 29 Met His Ile Ser Glu Leu Thr His Pro Asn Glu Leu Lys Gly Leu Ser 1 5 10 15 Ile Arg Glu Leu Glu Glu Val Ser Arg Gln Ile Arg Glu Lys His Leu 20 25 30 Gln Thr Val Ala Thr Ser Gly Gly His Leu Gly Pro Gly Leu Gly Val 35 40 45 Val Glu Leu Thr Val Ala Leu Tyr Ser Thr Leu Asp Leu Asp Lys Asp 50 55 60 Arg Val Ile Trp Asp Val Gly His Gln Ala Tyr Pro His Lys Met Leu 65 70 75 80 Thr Gly Arg Tyr His Asp Phe His Thr Leu Arg Gln Lys Asp Gly Val 85 90 95 Ala Gly Tyr Leu Lys Arg Ser Glu Ser Arg Phe Asp His Phe Gly Ala 100 105 110 Gly His Ala Ser Thr Ser Ile Ser Ala Gly Leu Gly Met Ala Leu Ala 115 120 125 Arg Asp Ala Lys Gly Glu Asp Phe Lys Val Val Ser Ile Ile Gly Asp 130 135 140 Gly Ala Leu Thr Gly Gly Met Ala Leu Glu Ala Ile Asn His Ala Gly 145 150 155 160 His Leu Pro His Thr Arg Leu Met Val Ile Leu Asn Asp Asn Glu Met 165 170 175 Ser Ile Ser Pro Asn Val Gly Ala Ile Ser Arg Tyr Leu Asn Lys Val 180 185 190 Arg Leu Ser Ser Pro Met Gln Phe Leu Thr Asp Asn Leu Glu Glu Gln 195 200 205 Ile Lys His Leu Pro Phe Val Gly Asp Ser Leu Thr Pro Glu Met Glu 210 215 220 Arg Val Lys Glu Gly Met Lys Arg Leu Val Val Pro Lys Val Gly Ala 225 230 235 240 Val Ile Glu Glu Leu Gly Phe Lys Tyr Phe Gly Pro Ile Asp Gly His 245 250 255 Ser Leu Gln Glu Leu Ile Asp Thr Phe Lys Gln Ala Glu Lys Val Pro 260 265 270 Gly Pro Val Phe Val His Val Ser Thr Thr Lys Gly Lys Gly Tyr Asp 275 280 285 Leu Ala Glu Lys Asp Gln Val Gly Tyr His Ala Gln Ser Pro Phe Asn 290 295 300 Leu Ser Thr Gly Lys Ala Tyr Pro Ser Ser Lys Pro Lys Pro Pro Ser 305 310 315 320 Tyr Ser Lys Val Phe Ala His Thr Leu Thr Thr Leu Ala Lys Glu Asn 325 330 335 Pro Asn Ile Val Gly Ile Thr Ala Ala Met Ala Thr Gly Thr Gly Leu 340 345 350 Asp Lys Leu Gln Ala Lys Leu Pro Lys Gln Tyr Val Asp Val Gly Ile 355 360 365 Ala Glu Gln His Ala Val Thr Leu Ala Ala Gly Met Ala Cys Glu Gly 370 375 380 Ile Arg Pro Val Val Ala Ile Tyr Ser Thr Phe Leu Gln Arg Gly Tyr 385 390 395 400 Asp Gln Ile Ile His Asp Val Cys Ile Gln Lys Leu Pro Val Phe Phe 405 410 415 Cys Leu 
Asp Arg Ala Gly Ile Val Gly Ala Asp Gly Pro Thr His Gln 420 425 430 Gly Met Tyr Asp Ile Ala Tyr Leu Arg Cys Ile Pro Asn Leu Val Leu 435 440 445 Met Ala Pro Lys Asp Glu Ala Glu Leu Gln Gln Met Leu Val Thr Gly 450 455 460 Val Asn Tyr Thr Gly Gly Ala Ile Ala Met Arg Tyr Pro Arg Gly Asn 465 470 475 480 Gly Ile Gly Val Pro Leu Met Glu Glu Gly Trp Glu Pro Leu Glu Ile 485 490 495 Gly Lys Ala Glu Ile Leu Arg Ser Gly Asp Asp Val Leu Leu Leu Gly 500 505 510 Tyr Gly Ser Met Val Tyr Pro Ala Leu Gln Thr Ala Glu Leu Leu His 515 520 525 Glu His Gly Ile Glu Ala Thr Val Val Asn Ala Arg Phe Val Lys Pro 530 535 540 Leu Asp Thr Glu Leu Ile Leu Pro Leu Ala Glu Arg Ile Gly Lys Val 545 550 555 560 Val Thr Met Glu Glu Gly Cys Leu Met Gly Gly Phe Gly Ser Ala Val 565 570 575 Ala Glu Ala Leu Met Asp Asn Asn Val Leu Val Pro Leu Lys Arg Leu 580 585 590 Gly Val Pro Asp Ile Leu Val Asp His Ala Thr Pro Glu Gln Ser Thr 595 600 605 Val Asp Leu Gly Leu Thr Pro Ala Gln Met Ala Gln Asn Ile Met Ala 610 615 620 Ser Leu Phe Lys Thr Glu Thr Glu Ser Val Val Ala Pro Gly Val Ser 625 630 635 640 30 608 PRT Thermotoga maritima 30 Met Leu Leu Asp Glu Ile Lys Arg Met Ser Tyr Asp Glu Leu Lys Arg 1 5 10 15 Leu Ala Glu Asp Ile Arg Lys Arg Ile Thr Glu Val Val Leu Lys Asn 20 25 30 Gly Gly His Leu Ala Ser Asn Leu Gly Thr Ile Glu Leu Thr Leu Ala 35 40 45 Leu Tyr Arg Val Phe Asp Pro Arg Glu Asp Ala Ile Ile Trp Asp Thr 50 55 60 Gly His Gln Ala Tyr Thr His Lys Ile Leu Thr Gly Arg Asp Asp Leu 65 70 75 80 Phe His Thr Ile Arg Thr Phe Gly Gly Leu Ser Gly Phe Val Thr Arg 85 90 95 Arg Glu Ser Pro Leu Asp Trp Phe Gly Thr Gly His Ala Gly Thr Ser 100 105 110 Ile Ala Ala Gly Leu Gly Phe Glu Lys Ala Phe Glu Leu Leu Gly Glu 115 120 125 Lys Arg His Val Val Val Val Ile Gly Asp Gly Ala Leu Thr Ser Gly 130 135 140 Met Ala Leu Glu Ala Leu Asn Gln Leu Lys Asn Leu Asn Ser Lys Met 145 150 155 160 Lys Ile Ile Leu Asn Asp Asn Gly Met Ser Ile Ser Pro Asn Val Gly 165 170 175 Gly Leu Ala Tyr His Leu Ser Lys Leu Arg Thr Ser Pro Ile Tyr Leu 180 185 
190 Lys Gly Lys Lys Val Leu Lys Lys Val Leu Glu Lys Thr Glu Ile Gly 195 200 205 Phe Glu Val Glu Glu Glu Met Lys Tyr Leu Arg Asp Ser Leu Lys Gly 210 215 220 Met Ile Gln Gly Thr Asn Phe Phe Glu Ser Leu Gly Leu Lys Tyr Phe 225 230 235 240 Gly Pro Phe Asp Gly His Asn Ile Glu Leu Leu Glu Lys Val Phe Lys 245 250 255 Arg Ile Arg Asp Tyr Asp Tyr Ser Ser Val Val His Val Val Thr Lys 260 265 270 Lys Gly Lys Gly Phe Thr Ala Ala Glu Glu Asn Pro Thr Lys Tyr His 275 280 285 Ser Ala Ser Pro Ser Gly Lys Pro Lys Met Leu Ser Tyr Ser Glu Leu 290 295 300 Leu Gly His Thr Leu Ser Arg Val Ala Arg Glu Asp Lys Lys Ile Val 305 310 315 320 Ala Ile Thr Ala Ala Met Ala Asp Gly Thr Gly Leu Ser Ile Phe Gln 325 330 335 Lys Glu His Pro Asp Arg Phe Phe Asp Leu Gly Ile Thr Glu Gln Thr 340 345 350 Cys Val Thr Phe Gly Ala Ala Leu Gly Leu His Gly Met Lys Pro Val 355 360 365 Val Ala Ile Tyr Ser Thr Phe Leu Gln Arg Ala Tyr Asp Gln Ile Ile 370 375 380 His Asp Val Ala Leu Gln Asn Ala Pro Val Leu Phe Ala Ile Asp Arg 385 390 395 400 Ser Gly Val Val Gly Glu Asp Gly Pro Thr His His Gly Leu Phe Asp 405 410 415 Ile Asn Tyr Leu Leu Pro Val Pro Asn Met Lys Ile Ile Ser Pro Ser 420 425 430 Ser Pro Glu Glu Phe Val Asn Ser Leu Tyr Thr Val Leu Lys His Leu 435 440 445 Asp Gly Pro Val Ala Ile Arg Tyr Pro Lys Glu Ser Phe Tyr Gly Glu 450 455 460 Val Glu Ser Leu Leu Glu Asn Met Lys Glu Ile Asp Leu Gly Trp Lys 465 470 475 480 Ile Leu Lys Arg Gly Arg Glu Ala Ala Ile Ile Ala Thr Gly Thr Ile 485 490 495 Leu Asn Glu Val Leu Lys Ile Pro Leu Asp Val Thr Val Val Asn Ala 500 505 510 Leu Thr Val Lys Pro Leu Asp Thr Ala Val Leu Lys Glu Ile Ala Arg 515 520 525 Asp His Asp Leu Ile Ile Thr Val Glu Glu Ala Met Lys Ile Gly Gly 530 535 540 Phe Gly Ser Phe Val Ala Gln Arg Leu Gln Glu Met Gly Trp Gln Gly 545 550 555 560 Lys Ile Val Asn Leu Gly Val Glu Asp Leu Phe Val Pro His Gly Gly 565 570 575 Arg Lys Glu Leu Leu Ser Met Leu Gly Leu Asp Ser Glu Gly Leu Thr 580 585 590 Lys Thr Val Leu Thr Tyr Ile Lys Ala Arg Ser Arg Glu Gly Lys Val 595 600 605 
31 620 PRT Escherichia coli 31 Met Ser Phe Asp Ile Ala Lys Tyr Pro Thr Leu Ala Leu Val Asp Ser 1 5 10 15 Thr Gln Glu Leu Arg Leu Leu Pro Lys Glu Ser Leu Pro Lys Leu Cys 20 25 30 Asp Glu Leu Arg Arg Tyr Leu Leu Asp Ser Val Ser Arg Ser Ser Gly 35 40 45 His Phe Ala Ser Gly Leu Gly Thr Val Glu Leu Thr Val Ala Leu His 50 55 60 Tyr Val Tyr Asn Thr Pro Phe Asp Gln Leu Ile Trp Asp Val Gly His 65 70 75 80 Gln Ala Tyr Pro His Lys Ile Leu Thr Gly Arg Arg Asp Lys Ile Gly 85 90 95 Thr Ile Arg Gln Lys Gly Gly Leu His Pro Phe Pro Trp Arg Gly Glu 100 105 110 Ser Glu Tyr Asp Val Leu Ser Val Gly His Ser Ser Thr Ser Ile Ser 115 120 125 Ala Gly Ile Gly Ile Ala Val Ala Ala Glu Lys Glu Gly Lys Asn Arg 130 135 140 Arg Thr Val Cys Val Ile Gly Asp Gly Ala Ile Thr Ala Gly Met Ala 145 150 155 160 Phe Glu Ala Met Asn His Ala Gly Asp Ile Arg Pro Asp Met Leu Val 165 170 175 Ile Leu Asn Asp Asn Glu Met Ser Ile Ser Glu Asn Val Gly Ala Leu 180 185 190 Asn Asn His Leu Ala Gln Leu Leu Ser Gly Lys Leu Tyr Ser Ser Leu 195 200 205 Arg Glu Gly Gly Lys Lys Val Phe Ser Gly Val Pro Pro Ile Lys Glu 210 215 220 Leu Leu Lys Arg Thr Glu Glu His Ile Lys Gly Met Val Val Pro Gly 225 230 235 240 Thr Leu Phe Glu Glu Leu Gly Phe Asn Tyr Ile Gly Pro Val Asp Gly 245 250 255 His Asp Val Leu Gly Leu Ile Thr Thr Leu Lys Asn Met Arg Asp Leu 260 265 270 Lys Gly Pro Gln Phe Leu His Ile Met Thr Lys Lys Gly Arg Gly Tyr 275 280 285 Glu Pro Ala Glu Lys Asp Pro Ile Thr Phe His Ala Val Pro Lys Phe 290 295 300 Asp Pro Ser Ser Gly Cys Leu Pro Lys Ser Ser Gly Gly Leu Pro Ser 305 310 315 320 Tyr Ser Lys Ile Phe Gly Asp Trp Leu Cys Glu Thr Ala Ala Lys Asp 325 330 335 Asn Lys Leu Met Ala Ile Thr Pro Ala Met Arg Glu Gly Ser Gly Met 340 345 350 Val Glu Phe Ser Arg Lys Phe Pro Asp Arg Tyr Phe Asp Val Ala Ile 355 360 365 Ala Glu Gln His Ala Val Thr Phe Ala Ala Gly Leu Ala Ile Gly Gly 370 375 380 Tyr Lys Pro Ile Val Ala Ile Tyr Ser Thr Phe Leu Gln Arg Ala Tyr 385 390 395 400 Asp Gln Val Leu His Asp Val Ala Ile Gln Lys Leu Pro Val Leu Phe 
405 410 415 Ala Ile Asp Arg Ala Gly Ile Val Gly Ala Asp Gly Gln Thr His Gln 420 425 430 Gly Ala Phe Asp Leu Ser Tyr Leu Arg Cys Ile Pro Glu Met Val Ile 435 440 445 Met Thr Pro Ser Asp Glu Asn Glu Cys Arg Gln Met Leu Tyr Thr Gly 450 455 460 Tyr His Tyr Asn Asp Gly Pro Ser Ala Val Arg Tyr Pro Arg Gly Asn 465 470 475 480 Ala Val Gly Val Glu Leu Thr Pro Leu Glu Lys Leu Pro Ile Gly Lys 485 490 495 Gly Ile Val Lys Arg Arg Gly Glu Lys Leu Ala Ile Leu Asn Phe Gly 500 505 510 Thr Leu Met Pro Glu Ala Ala Lys Val Ala Glu Ser Leu Asn Ala Thr 515 520 525 Leu Val Asp Met Arg Phe Val Lys Pro Leu Asp Glu Ala Leu Ile Leu 530 535 540 Glu Met Ala Ala Ser His Glu Ala Leu Val Thr Val Glu Glu Asn Ala 545 550 555 560 Ile Met Gly Gly Ala Gly Ser Gly Val Asn Glu Val Leu Met Ala His 565 570 575 Arg Lys Pro Val Pro Val Leu Asn Ile Gly Leu Pro Asp Phe Phe Ile 580 585 590 Pro Gln Gly Thr Gln Glu Glu Met Arg Ala Glu Leu Gly Leu Asp Ala 595 600 605 Ala Gly Met Glu Ala Lys Ile Lys Ala Trp Leu Ala 610 615 620 32 637 PRT Neisseria meningitidis 32 Met Asn Pro Ser Pro Leu Leu Asp Leu Ile Asp Ser Pro Gln Asp Leu 1 5 10 15 Arg Arg Leu Asp Lys Lys Gln Leu Pro Arg Leu Ala Gly Glu Leu Arg 20 25 30 Thr Phe Leu Leu Glu Ser Val Gly Gln Thr Gly Gly His Phe Ala Ser 35 40 45 Asn Leu Gly Ala Val Glu Leu Thr Val Ala Leu His Tyr Val Tyr Asn 50 55 60 Thr Pro Glu Asp Lys Leu Val Trp Asp Val Gly His Gln Ser Tyr Pro 65 70 75 80 His Lys Ile Leu Thr Gly Arg Lys Asn Gln Met His Thr Met Arg Gln 85 90 95 Tyr Gly Gly Leu Ala Gly Phe Pro Lys Arg Cys Glu Ser Glu Tyr Asp 100 105 110 Ala Phe Gly Val Gly His Ser Ser Thr Ser Ile Gly Ala Ala Leu Gly 115 120 125 Met Ala Ala Ala Asp Lys Gln Leu Gly Ser Asp Arg Arg Ser Val Ala 130 135 140 Ile Ile Gly Asp Gly Ala Met Thr Ala Gly Gln Ala Phe Glu Ala Leu 145 150 155 160 Asn Cys Ala Gly Asp Met Asp Val Asp Leu Leu Val Val Leu Asn Asp 165 170 175 Asn Glu Met Ser Ile Ser Pro Asn Val Gly Ala Leu Pro Lys Tyr Leu 180 185 190 Ala Ser Asn Val Val Arg Asp Met His Gly Leu Leu Ser Thr Val Lys 195 
200 205 Ala Gln Thr Gly Lys Val Leu Asp Lys Ile Pro Gly Ala Met Glu Phe 210 215 220 Ala Gln Lys Val Glu His Lys Ile Lys Thr Leu Ala Glu Glu Ala Glu 225 230 235 240 His Ala Lys Gln Ser Leu Ser Leu Phe Glu Asn Phe Gly Phe Arg Tyr 245 250 255 Thr Gly Pro Val Asp Gly His Asn Val Glu Asn Leu Val Asp Val Leu 260 265 270 Glu Asp Leu Arg Gly Arg Lys Gly Pro Gln Leu Leu His Val Ile Thr 275 280 285 Lys Lys Gly Asn Gly Tyr Lys Leu Ala Glu Asn Asp Pro Val Lys Tyr 290 295 300 His Ala Val Ala Asn Leu Pro Lys Glu Ser Ala Ala Gln Met Pro Ser 305 310 315 320 Glu Lys Glu Pro Lys Pro Ala Ala Lys Pro Thr Tyr Thr Gln Val Phe 325 330 335 Gly Lys Trp Leu Cys Asp Arg Ala Ala Ala Asp Ser Arg Leu Val Ala 340 345 350 Ile Thr Pro Ala Met Arg Glu Gly Ser Gly Leu Val Glu Phe Glu Gln 355 360 365 Arg Phe Pro Asp Arg Tyr Phe Asp Val Gly Ile Ala Glu Gln His Ala 370 375 380 Val Thr Phe Ala Gly Gly Leu Ala Cys Glu Gly Met Lys Pro Val Val 385 390 395 400 Ala Ile Tyr Ser Thr Phe Leu Gln Arg Ala Tyr Asp Gln Leu Val His 405 410 415 Asp Ile Ala Leu Gln Asn Leu Pro Val Leu Phe Ala Val Asp Arg Ala 420 425 430 Gly Ile Val Gly Ala Asp Gly Pro Thr His Ala Gly Leu Tyr Asp Leu 435 440 445 Ser Phe Leu Arg Cys Ile Pro Asn Met Ile Val Ala Ala Pro Ser Asp 450 455 460 Glu Asn Glu Cys Arg Leu Leu Leu Ser Thr Cys Tyr Gln Ala Asp Ala 465 470 475 480 Pro Ala Ala Val Arg Tyr Pro Arg Gly Thr Gly Thr Gly Val Pro Val 485 490 495 Ser Asp Gly Met Glu Thr Val Glu Ile Gly Lys Gly Ile Ile Arg Arg 500 505 510 Glu Gly Glu Lys Thr Ala Phe Ile Ala Phe Gly Ser Met Val Ala Pro 515 520 525 Ala Leu Ala Val Ala Gly Lys Leu Asn Ala Thr Val Ala Asp Met Arg 530 535 540 Phe Val Lys Pro Ile Asp Glu Glu Leu Ile Val Arg Leu Ala Arg Ser 545 550 555 560 His Asp Arg Ile Val Thr Leu Glu Glu Asn Ala Glu Gln Gly Gly Ala 565 570 575 Gly Ser Ala Val Leu Glu Val Leu Ala Lys His Gly Ile Cys Lys Pro 580 585 590 Val Leu Leu Leu Gly Val Ala Asp Thr Val Thr Gly His Gly Asp Pro 595 600 605 Lys Lys Leu Leu Asp Asp Leu Gly Leu Ser Ala Glu Ala Val Glu Arg 610 615 
620 Arg Val Arg Ala Trp Leu Ser Asp Arg Asp Ala Ala Asn 625 630 635 33 625 PRT Haemophilus influenzae 33 Met Thr Asn Asn Met Asn Asn Tyr Pro Leu Leu Ser Leu Ile Asn Ser 1 5 10 15 Pro Glu Asp Leu Arg Leu Leu Asn Lys Asp Gln Leu Pro Gln Leu Cys 20 25 30 Gln Glu Leu Arg Ala Tyr Leu Leu Glu Ser Val Ser Gln Thr Ser Gly 35 40 45 His Leu Ala Ser Gly Leu Gly Thr Val Glu Leu Thr Val Ala Leu His 50 55 60 Tyr Val Tyr Lys Thr Pro Phe Asp Gln Leu Ile Trp Asp Val Gly His 65 70 75 80 Gln Ala Tyr Pro His Lys Ile Leu Thr Gly Arg Arg Glu Gln Met Ser 85 90 95 Thr Ile Arg Gln Lys Asp Gly Ile His Pro Phe Pro Trp Arg Glu Glu 100 105 110 Ser Glu Phe Asp Val Leu Ser Val Gly His Ser Ser Thr Ser Ile Ser 115 120 125 Ala Gly Leu Gly Ile Ala Val Ala Ala Glu Arg Glu Asn Ala Gly Arg 130 135 140 Lys Thr Val Cys Val Ile Gly Asp Gly Ala Ile Thr Ala Gly Met Ala 145 150 155 160 Phe Glu Ala Leu Asn His Ala Gly Ala Leu His Thr Asp Met Leu Val 165 170 175 Ile Leu Asn Asp Asn Glu Met Ser Ile Ser Glu Asn Val Gly Ala Leu 180 185 190 Asn Asn His Leu Ala Arg Ile Phe Ser Gly Ser Leu Tyr Ser Thr Leu 195 200 205 Arg Asp Gly Ser Lys Lys Ile Leu Asp Lys Val Pro Pro Ile Lys Asn 210 215 220 Phe Met Lys Lys Thr Glu Glu His Met Lys Gly Val Met Phe Ser Pro 225 230 235 240 Glu Ser Thr Leu Phe Glu Glu Leu Gly Phe Asn Tyr Ile Gly Pro Val 245 250 255 Asp Gly His Asn Ile Asp Glu Leu Val Ala Thr Leu Thr Asn Met Arg 260 265 270 Asn Leu Lys Gly Pro Gln Phe Leu His Ile Lys Thr Lys Lys Gly Lys 275 280 285 Gly Tyr Ala Pro Ala Glu Lys Asp Pro Ile Gly Phe His Gly Val Pro 290 295 300 Lys Phe Asp Pro Ile Ser Gly Glu Leu Pro Lys Asn Asn Ser Lys Pro 305 310 315 320 Thr Tyr Ser Lys Ile Phe Gly Asp Trp Leu Cys Glu Met Ala Glu Lys 325 330 335 Asp Ala Lys Ile Ile Gly Ile Thr Pro Ala Met Arg Glu Gly Ser Gly 340 345 350 Met Val Glu Phe Ser Gln Arg Phe Pro Lys Gln Tyr Phe Asp Val Ala 355 360 365 Ile Ala Glu Gln His Ala Val Thr Phe Ala Thr Gly Leu Ala Ile Gly 370 375 380 Gly Tyr Lys Pro Val Val Ala Ile Tyr Ser Thr Phe Leu Gln Arg Ala 385 390 
395 400 Tyr Asp Gln Leu Ile His Asp Val Ala Ile Gln Asn Leu Pro Val Leu 405 410 415 Phe Ala Ile Asp Arg Ala Gly Ile Val Gly Ala Asp Gly Ala Thr His 420 425 430 Gln Gly Ala Phe Asp Ile Ser Phe Met Arg Cys Ile Pro Asn Met Ile 435 440 445 Ile Met Thr Pro Ser Asp Glu Asn Glu Cys Arg Gln Met Leu Tyr Thr 450 455 460 Gly Tyr Gln Cys Gly Lys Pro Ala Ala Val Arg Tyr Pro Arg Gly Asn 465 470 475 480 Ala Val Gly Val Lys Leu Thr Pro Leu Glu Met Leu Pro Ile Gly Lys 485 490 495 Ser Arg Leu Ile Arg Lys Gly Gln Lys Ile Ala Ile Leu Asn Phe Gly 500 505 510 Thr Leu Leu Pro Ser Ala Leu Glu Leu Ser Glu Lys Leu Asn Ala Thr 515 520 525 Val Val Asp Met Arg Phe Val Lys Pro Ile Asp Ile Glu Met Ile Asn 530 535 540 Val Leu Ala Gln Thr His Asp Tyr Leu Val Thr Leu Glu Glu Asn Ala 545 550 555 560 Ile Gln Gly Gly Ala Gly Ser Ala Val Ala Glu Val Leu Asn Ser Ser 565 570 575 Gly Lys Ser Thr Ala Leu Leu Gln Leu Gly Leu Pro Asp Tyr Phe Ile 580 585 590 Pro Gln Ala Thr Gln Gln Glu Ala Leu Ala Asp Leu Gly Leu Asp Thr 595 600 605 Lys Gly Ile Glu Glu Lys Ile Leu Asn Phe Ile Ala Lys Gln Gly Asn 610 615 620 Leu 625 34 1205 PRT Plasmodium falciparum 34 Met Ile Phe Asn Tyr Val Phe Phe Lys Asn Phe Val Pro Val Val Leu 1 5 10 15 Tyr Ile Leu Leu Ile Ile Tyr Ile Asn Leu Asn Gly Met Asn Asn Lys 20 25 30 Asn Gln Ile Lys Thr Glu Lys Ile Tyr Ile Lys Lys Leu Asn Arg Leu 35 40 45 Ser Arg Lys Asn Ser Leu Cys Ser Ser Lys Asn Lys Ile Ala Cys Leu 50 55 60 Phe Asp Ile Gly Asn Asp Asp Asn Arg Asn Thr Thr Tyr Gly Tyr Asn 65 70 75 80 Val Asn Val Lys Asn Asp Asp Ile Asn Ser Leu Leu Lys Asn Asn Tyr 85 90 95 Ser Asn Lys Leu Tyr Met Asp Lys Arg Lys Asn Ile Asn Asn Val Ile 100 105 110 Ser Thr Asn Lys Ile Ser Gly Ser Ile Ser Asn Ile Cys Ser Arg Asn 115 120 125 Gln Lys Glu Asn Glu Gln Lys Arg Asn Lys Gln Arg Cys Leu Thr Gln 130 135 140 Cys His Thr Tyr Asn Met Ser His Glu Gln Asp Lys Leu Ala Asn Asp 145 150 155 160 Asn Asn Arg Asn Asn Lys Lys Asn Phe Asn Leu Leu Phe Ile Asn Tyr 165 170 175 Phe Asn Leu Lys Arg Met Lys Asn Ser Leu Leu Asn 
Lys Asp Asn Phe 180 185 190 Phe Tyr Cys Lys Glu Lys Lys Leu Ser Phe Leu His Lys Ala Tyr Lys 195 200 205 Lys Lys Asn Cys Thr Phe Gln Asn Tyr Ser Leu Lys Arg Lys Ser Asn 210 215 220 Arg Asp Ser His Lys Leu Phe Ser Gly Glu Phe Asp Asp Tyr Thr Asn 225 230 235 240 Asn Asn Ala Leu Tyr Glu Ser Glu Lys Lys Glu Tyr Ile Thr Leu Asn 245 250 255 Asn Asn Asn Lys Asn Asn Asn Asn Lys Asn Asn Asp Asn Lys Asn Asn 260 265 270 Asp Asn Asn Asp Tyr Asn Asn Asn Asn Ser Cys Asn Asn Leu Gly Glu 275 280 285 Arg Ser Asn His Tyr Asp Asn Tyr Gly Gly Asp Asn Asn Asn Pro Cys 290 295 300 Asn Asn Asn Asn Asp Lys Tyr Asp Ile Gly Lys Tyr Phe Lys Gln Ile 305 310 315 320 Asn Thr Phe Ile Asn Ile Asp Glu Tyr Lys Thr Ile Tyr Gly Asp Glu 325 330 335 Ile Tyr Lys Glu Ile Tyr Glu Leu Tyr Val Glu Arg Asn Ile Pro Glu 340 345 350 Tyr Tyr Glu Arg Lys Tyr Phe Ser Glu Asp Ile Lys Lys Ser Val Leu 355 360 365 Phe Asp Ile Asp Lys Tyr Asn Asp Val Glu Phe Glu Lys Ala Ile Lys 370 375 380 Glu Glu Phe Ile Asn Asn Gly Val Tyr Ile Asn Asn Ile Asp Asn Thr 385 390 395 400 Tyr Tyr Lys Lys Glu Asn Ile Leu Ile Met Lys Lys Ile Leu His Tyr 405 410 415 Phe Pro Leu Leu Lys Leu Ile Asn Asn Pro Ser Asp Leu Lys Lys Leu 420 425 430 Lys Lys Gln Tyr Leu Pro Leu Leu Ala His Glu Leu Lys Ile Phe Leu 435 440 445 Phe Phe Ile Val Asn Ile Thr Gly Gly His Phe Ser Ser Val Leu Ser 450 455 460 Ser Leu Glu Ile Gln Leu Leu Leu Leu Tyr Ile Phe Asn Gln Pro Tyr 465 470 475 480 Asp Asn Val Ile Tyr Asp Ile Gly His Gln Ala Tyr Val His Lys Ile 485 490 495 Leu Thr Gly Arg Lys Leu Leu Phe Leu Ser Leu Arg Asn Lys Lys Gly 500 505 510 Ile Ser Gly Phe Leu Asn Ile Phe Glu Ser Ile Tyr Asp Lys Phe Gly 515 520 525 Ala Gly His Ser Ser Thr Ser Leu Ser Ala Ile Gln Gly Tyr Tyr Glu 530 535 540 Ala Glu Trp Gln Val Lys Asn Lys Glu Lys Tyr Gly Asn Gly Asp Ile 545 550 555 560 Glu Ile Ser Asp Asn Ala Asn Val Thr Asn Asn Glu Arg Ile Phe Gln 565 570 575 Lys Gly Ile His Asn Asp Asn Asn Ile Asn Asn Asn Ile Asn Asn Asn 580 585 590 Asn Tyr Ile Asn Pro Ser Asp Val Val Gly Arg Glu Asn 
Thr Asn Val 595 600 605 Pro Asn Val Arg Asn Asp Asn His Asn Val Asp Lys Val His Ile Ala 610 615 620 Ile Ile Gly Asp Gly Gly Leu Thr Gly Gly Met Ala Leu Glu Ala Leu 625 630 635 640 Asn Tyr Ile Ser Phe Leu Asn Ser Lys Ile Leu Ile Ile Tyr Asn Asp 645 650 655 Asn Gly Gln Val Ser Leu Pro Thr Asn Ala Val Ser Ile Ser Gly Asn 660 665 670 Arg Pro Ile Gly Ser Ile Ser Asp His Leu His Tyr Phe Val Ser Asn 675 680 685 Ile Glu Ala Asn Ala Gly Asp Asn Lys Leu Ser Lys Asn Ala Lys Glu 690 695 700 Asn Asn Ile Phe Glu Asn Leu Asn Tyr Asp Tyr Ile Gly Val Val Asn 705 710 715 720 Gly Asn Asn Thr Glu Glu Leu Phe Lys Val Leu Asn Asn Ile Lys Glu 725 730 735 Asn Lys Leu Lys Arg Ala Thr Val Leu His Val Arg Thr Lys Lys Ser 740 745 750 Asn Asp Phe Ile Asn Ser Lys Ser Pro Ile Ser Ile Leu His Ser Ile 755 760 765 Lys Lys Asn Glu Ile Phe Pro Phe Asp Thr Thr Ile Leu Asn Gly Asn 770 775 780 Ile His Lys Glu Asn Lys Ile Glu Glu Glu Lys Asn Val Ser Ser Ser 785 790 795 800 Thr Lys Tyr Asp Val Asn Asn Lys Asn Asn Lys Asn Asn Asp Asn Ser 805 810 815 Glu Ile Ile Lys Tyr Glu Asp Met Phe Ser Lys Glu Thr Phe Thr Asp 820 825 830 Ile Tyr Thr Asn Glu Met Leu Lys Tyr Leu Lys Lys Asp Arg Asn Ile 835 840 845 Ile Phe Leu Ser Pro Ala Met Leu Gly Gly Ser Gly Leu Val Lys Ile 850 855 860 Ser Glu Arg Tyr Pro Asn Asn Val Tyr Asp Val Gly Ile Ala Glu Gln 865 870 875 880 His Ser Val Thr Phe Ala Ala Ala Met Ala Met Asn Lys Lys Leu Lys 885 890 895 Ile Gln Leu Cys Ile Tyr Ser Thr Phe Leu Gln Arg Ala Tyr Asp Gln 900 905 910 Ile Ile His Asp Leu Asn Leu Gln Asn Ile Pro Leu Lys Val Ile Ile 915 920 925 Gly Arg Ser Gly Leu Val Gly Glu Asp Gly Ala Thr His Gln Gly Ile 930 935 940 Tyr Asp Leu Ser Tyr Leu Gly Thr Leu Asn Asn Ala Tyr Ile Ile Ser 945 950 955 960 Pro Ser Asn Gln Val Asp Leu Lys Arg Ala Leu Arg Phe Ala Tyr Leu 965 970 975 Asp Lys Asp His Ser Val Tyr Ile Arg Ile Pro Arg Met Asn Ile Leu 980 985 990 Ser Asp Lys Tyr Met Lys Gly Tyr Leu Asn Ile His Met Lys Asn Glu 995 1000 1005 Ser Lys Asn Ile Asp Val Asn Val Asp Ile Asn Asp Asp 
Val Asp Lys 1010 1015 1020 Tyr Ser Glu Glu Tyr Met Asp Asp Asp Asn Phe Ile Lys Ser Phe Ile 1025 1030 1035 1040 Gly Lys Ser Arg Ile Ile Lys Met Asp Asn Glu Asn Asn Asn Thr Asn 1045 1050 1055 Glu His Tyr Ser Ser Arg Gly Asp Thr Gln Thr Lys Lys Lys Lys Val 1060 1065 1070 Cys Ile Phe Asn Met Gly Ser Met Leu Phe Asn Val Ile Asn Ala Ile 1075 1080 1085 Lys Glu Ile Glu Lys Glu Gln Tyr Ile Ser His Asn Tyr Ser Phe Ser 1090 1095 1100 Ile Val Asp Met Ile Phe Leu Asn Pro Leu Asp Lys Asn Met Ile Asp 1105 1110 1115 1120 His Val Ile Lys Gln Asn Lys His Gln Tyr Leu Ile Thr Tyr Glu Asp 1125 1130 1135 Asn Thr Ile Gly Gly Phe Ser Thr His Phe Asn Asn Tyr Leu Ile Glu 1140 1145 1150 Asn Asn Tyr Ile Thr Lys His Asn Leu Tyr Val His Asn Ile Tyr Leu 1155 1160 1165 Ser Asn Glu Pro Ile Glu His Ala Ser Phe Lys Asp Gln Gln Glu Val 1170 1175 1180 Val Lys Met Asp Lys Cys Ser Leu Val Asn Arg Ile Lys Asn Tyr Leu 1185 1190 1195 1200 Lys Asn Asn Pro Thr 1205 35 631 PRT Streptomyces sp. CL 190 35 Met Thr Ile Leu Glu Asn Ile Arg Gln Pro Arg Asp Leu Lys Ala Leu 1 5 10 15 Pro Glu Glu Gln Leu His Glu Leu Ser Glu Glu Ile Arg Gln Phe Leu 20 25 30 Val His Ala Val Thr Arg Thr Gly Gly His Leu Gly Pro Asn Leu Gly 35 40 45 Val Val Glu Leu Thr Ile Ala Leu His Arg Val Phe Glu Ser Pro Val 50 55 60 Asp Arg Ile Leu Trp Asp Thr Gly His Gln Ser Tyr Val His Lys Leu 65 70 75 80 Leu Thr Gly Arg Gln Asp Phe Ser Lys Leu Arg Gly Lys Gly Gly Leu 85 90 95 Ser Gly Tyr Pro Ser Arg Glu Glu Ser Glu His Asp Val Ile Glu Asn 100 105 110 Ser His Ala Ser Thr Ala Leu Gly Trp Ala Asp Gly Leu Ala Lys Ala 115 120 125 Arg Arg Val Gln Gly Glu Lys Gly His Val Val Ala Val Ile Gly Gly 130 135 140 Arg Ala Leu Thr Gly Gly Met Ala Trp Glu Ala Leu Asn Asn Ile Ala 145 150 155 160 Ala Ala Lys Asp Gln Pro Leu Ile Ile Val Val Asn Asp Asn Glu Arg 165 170 175 Ser Tyr Ala Pro Thr Ile Gly Gly Leu Ala Asn His Leu Ala Thr Leu 180 185 190 Arg Thr Thr Asp Gly Tyr Glu Lys Val Leu Ala Trp Gly Lys Asp Val 195 200 205 Leu Leu Arg Thr Pro Ile Val Gly His Pro Leu Tyr 
Glu Ala Leu His 210 215 220 Gly Ala Lys Lys Gly Phe Lys Asp Ala Phe Ala Pro Gln Gly Met Phe 225 230 235 240 Glu Asp Leu Gly Leu Lys Tyr Val Gly Pro Ile Asp Gly His Asp Ile 245 250 255 Gly Ala Val Glu Ser Ala Leu Arg Arg Ala Lys Arg Phe His Gly Pro 260 265 270 Val Leu Val His Cys Leu Thr Val Lys Gly Arg Gly Tyr Glu Pro Ala 275 280 285 Leu Ala His Glu Glu Asp His Phe His Thr Val Gly Val Met Asp Pro 290 295 300 Leu Thr Cys Glu Pro Leu Ser Pro Thr Asp Gly Pro Ser Trp Thr Ser 305 310 315 320 Val Phe Gly Asp Glu Ile Val Arg Ile Gly Ala Glu Arg Glu Asp Ile 325 330 335 Val Ala Ile Thr Ala Ala Met Leu His Pro Val Gly Leu Ala Arg Phe 340 345 350 Ala Asp Arg Phe Pro Asp Arg Val Trp Asp Val Gly Ile Ala Glu Gln 355 360 365 His Ala Ala Val Ser Ala Ala Gly Leu Ala Thr Gly Gly Leu His Pro 370 375 380 Val Val Ala Val Tyr Ala Thr Phe Leu Asn Arg Ala Phe Asp Gln Leu 385 390 395 400 Leu Met Asp Val Ala Leu His Arg Cys Gly Val Thr Phe Val Leu Asp 405 410 415 Arg Ala Gly Val Thr Gly Val Asp Gly Ala Ser His Asn Gly Met Trp 420 425 430 Asp Met Ser Val Leu Gln Val Val Pro Gly Leu Arg Ile Ala Ala Pro 435 440 445 Arg Asp Ala Asp His Val Arg Ala Gln Leu Arg Glu Ala Val Ala Val 450 455 460 Asp Asp Ala Pro Thr Leu Ile Arg Phe Pro Lys Glu Ser Val Gly Pro 465 470 475 480 Arg Ile Pro Ala Leu Asp Arg Val Gly Gly Leu Asp Val Leu His Arg 485 490 495 Asp Glu Arg Pro Glu Val Leu Leu Val Ala Val Gly Val Met Ala Gln 500 505 510 Val Cys Leu Gln Thr Ala Glu Leu Leu Arg Ala Arg Gly Ile Gly Cys 515 520 525 Thr Val Val Asp Pro Arg Trp Val Lys Pro Val Asp Pro Val Leu Pro 530 535 540 Pro Leu Ala Ala Glu His Arg Leu Val Ala Val Val Glu Asp Asn Ser 545 550 555 560 Arg Ala Ala Gly Val Gly Ser Ala Val Ala Leu Ala Leu Gly Asp Ala 565 570 575 Asp Val Asp Val Pro Val Arg Arg Phe Gly Ile Pro Glu Gln Phe Leu 580 585 590 Ala His Ala Arg Arg Gly Glu Val Leu Ala Asp Ile Gly Leu Thr Pro 595 600 605 Val Glu Ile Ala Gly Arg Ile Gly Ala Ser Leu Pro Val Arg Glu Glu 610 615 620 Pro Ala Glu Glu Gln Pro Ala 625 630 36 618 PRT 
Helicobacter pylori 36 Met Ile Leu Gln Asn Lys Thr Phe Asp Leu Asn Pro Asn Asp Ile Ala 1 5 10 15 Gly Leu Glu Leu Val Cys Gln Thr Leu Arg Asn Arg Ile Leu Glu Val 20 25 30 Val Ser Ala Asn Gly Gly His Leu Ser Ser Ser Leu Gly Ala Val Glu 35 40 45 Leu Ile Val Gly Met His Ala Leu Phe Asp Cys Gln Lys Asn Pro Phe 50 55 60 Ile Phe Asp Thr Ser His Gln Ala Tyr Ala His Lys Leu Leu Thr Gly 65 70 75 80 Arg Phe Glu Ser Phe Ser Thr Leu Arg Gln Phe Lys Gly Leu Ser Gly 85 90 95 Phe Thr Lys Pro Ser Glu Ser Ala Tyr Asp Tyr Phe Ile Ala Gly His 100 105 110 Ser Ser Thr Ser Val Ser Ile Gly Val Gly Val Ala Lys Ala Phe Cys 115 120 125 Leu Lys Gln Ala Leu Gly Met Pro Ile Ala Leu Leu Gly Asp Gly Ser 130 135 140 Ile Ser Ala Gly Ile Phe Tyr Glu Ala Leu Asn Glu Leu Gly Asp Arg 145 150 155 160 Lys Tyr Pro Met Ile Met Ile Leu Asn Asp Asn Glu Met Ser Ile Ser 165 170 175 Thr Pro Ile Gly Ala Leu Ser Lys Ala Leu Ser Gln Leu Met Lys Gly 180 185 190 Pro Phe Tyr Gln Ser Phe Arg Ser Lys Val Lys Lys Ile Leu Ser Thr 195 200 205 Leu Pro Glu Ser Val Asn Tyr Leu Ala Ser Arg Phe Glu Glu Ser Phe 210 215 220 Lys Leu Ile Thr Pro Gly Val Phe Phe Glu Glu Leu Gly Ile Asn Tyr 225 230 235 240 Ile Gly Pro Ile Asn Gly His Asp Leu Ser Ala Ile Ile Glu Thr Leu 245 250 255 Lys Leu Ala Lys Glu Leu Lys Glu Pro Val Leu Ile His Ala Gln Thr 260 265 270 Leu Lys Gly Lys Gly Tyr Lys Ile Ala Glu Gly Arg Tyr Glu Lys Trp 275 280 285 His Gly Val Gly Pro Phe Asp Leu Asp Thr Gly Leu Ser Lys Lys Ser 290 295 300 Lys Ser Ala Ile Leu Ser Pro Thr Glu Ala Tyr Ser Asn Thr Leu Leu 305 310 315 320 Glu Leu Ala Lys Lys Asp Glu Lys Ile Val Gly Val Thr Ala Ala Met 325 330 335 Pro Ser Gly Thr Gly Leu Asp Lys Leu Ile Asp Ala Tyr Pro Leu Arg 340 345 350 Phe Phe Asp Val Ala Ile Ala Glu Gln His Ala Leu Thr Ser Ser Ser 355 360 365 Ala Met Ala Lys Glu Gly Phe Lys Pro Phe Val Ser Ile Tyr Ser Thr 370 375 380 Phe Leu Gln Arg Ala Tyr Asp Ser Ile Val His Asp Ala Cys Ile Ser 385 390 395 400 Ser Leu Pro Ile Lys Leu Ala Ile Asp Arg Ala Gly Ile Val Gly Glu 405 410 
415 Asp Gly Glu Thr His Gln Gly Leu Leu Asp Val Ser Tyr Leu Arg Ser 420 425 430 Ile Pro Asn Met Val Ile Phe Ala Pro Arg Asp Asn Glu Thr Leu Lys 435 440 445 Asn Ala Val Arg Phe Ala Asn Glu His Asp Ser Ser Pro Cys Ala Phe 450 455 460 Arg Tyr Pro Arg Gly Ser Phe Ala Leu Lys Glu Gly Val Phe Glu Pro 465 470 475 480 Ser Gly Phe Val Leu Gly Gln Ser Glu Leu Leu Lys Lys Glu Gly Glu 485 490 495 Ile Leu Leu Ile Gly Tyr Gly Asn Gly Val Gly Arg Ala His Leu Val 500 505 510 Gln Leu Ala Leu Lys Glu Lys Asn Ile Glu Cys Ala Leu Leu Asp Leu 515 520 525 Arg Phe Leu Lys Pro Leu Asp Pro Asn Leu Ser Ala Ile Val Ala Pro 530 535 540 Tyr Gln Lys Leu Tyr Val Phe Ser Asp Asn Tyr Lys Leu Gly Gly Val 545 550 555 560 Ala Ser Ala Ile Leu Glu Phe Leu Ser Glu Gln Asn Ile Leu Lys Pro 565 570 575 Val Lys Ser Phe Glu Ile Ile Asp Glu Phe Ile Met His Gly Asn Thr 580 585 590 Ala Leu Val Glu Lys Ser Leu Gly Leu Asp Thr Glu Ser Leu Thr Asp 595 600 605 Ala Ile Leu Lys Asp Leu Gly Gln Glu Arg 610 615 37 1990 DNA Rhodobacter sphaeroides 37 cgacggcccg gtagccccgg cgcggctgca gcaccgtcag acgtccgccg agaaagccgt 60 cggaagtcaa ttcgtccggg gcgaacatca gggggtcgtc gggatgccgt tgtcggacat 120 cacccggcag gcgcgatccc agtcttcttc cgggacaaac agacgccgcg gcaatatgcc 180 gatggagcct tcgaggacgc tcatgtggac gtccaccgga aaggcgtcta tatcctcgcc 240 ctgaaggagc gcggtggcga aggcgatgat cgtcgggtcg gtcgtgcgca acagttcctt 300 catgtcgggg acattgtcgg caacgcctcg gtttgtcgag gccggttcgt cgaccgggtg 360 gcaggatcgg gatgggattg gacgaggttt cgcaaaagcc gcatgaacgg ctcgccgcgt 420 ggctggccga ggacatggcc gccgtcaacg ggctgatccg cgagcggatg gcctcgaaac 480 acgcgccccg cattcccgag gtcacggcgc atctggtcga ggccggcggc aagcggctgc 540 ggccgctcct gacgctcgcc gcggcgcggc tgtgcggcta cgaggggccc tatcacatcc 600 atctggccgc gacggtggag ttcatccaca cggcgacgct gcttcacgac gatgtggtgg 660 acgaaagcca ccgccgccgc ggcaaaccca cggcgaacct gctgtgggac aacaaatcct 720 cggtgctggt gggcgactat ctcttcgccc gcagcttcca gctgatggtc gagaccggct 780 cgcttcgcgt gatggacatc ctcgccaatg cctcggccac catctccgag ggcgaggtgc 840 
tgcagctgac cgcggcccag gatctgcgca cgaccgagga catccacctg caggtggtgc 900 gcggcaagac ggccgcgctc tttgccgcgg caaccgaggt gggcggcgtg gtcgcgggcg 960 tgcccgaggc gcaggtcgag gcgctccacg cctacgggga cgcgctgggg atcgccttcc 1020 agatcgtcga cgacctcctc gattatggcg gcgtggatgc ccagatcggc aagaacaccg 1080 gcgacgactt ccgcgaacgc aagctgacgc tgccggtcat caaggcggtg gcccaggccg 1140 atgccgagga gcgcgccttc tggcagcggg tgatcgagaa gggcgaccag cgcgagggtg 1200 acctcgagca agcccatgcg atcatgtccc gccacggcgc catggaggcc gcccggcagg 1260 atgcgctccg ctgggtcacg gtggcgcgcg aggcactcgg ccagctgccg gagcacccgc 1320 tgcgcgagat gctgcacgat ctggccgatt tcgtggtcga acgcatcgcc tgatcccttc 1380 cgggcgctct gccccggcgc agcgcaggat cccgcgctgc gcccctttcg gccttccgac 1440 agtccctctg ccgcgggagg ccggcctcgc ctgagaagcc gcactggccg ccggtcttcc 1500 cccgaaccgc tcccgggcct gctcggaagg cgtccgccgc aaaagccccc gcgggggggc 1560 cccaccggcg gccatcagga agagaccgtt gaagcggccc gctcgaatcc tgtcgcgccc 1620 ccccccgacc gggcggctct ccgatccgtg ttcgctcggc gatggacagc cgttccctgt 1680 ccgttcatga tggcgccatg cagaccctta ccgttcccga ttccggcctc gccccctcct 1740 gcccggccaa aggctcgccc gcggcgtctg ccgccatctg cgcagccatg atttcgtctc 1800 ggtggtcgaa ctcgtgcccg cgcccggcct cagggtcgac gtgatggcgc tggggcccaa 1860 gggcgagatc tgggtggtgg aatgcaaatc ctcgcgcgcg gactatcagt ccgaccgcaa 1920 gtggcagggc tatctcgact ggtgcgaccg cttcttcttc gcggtggacg aggaccagcc 1980 cgggccgtcg 1990 38 1002 DNA Rhodobacter sphaeroides 38 atgggattgg acgaggtttc gcaaaagccg catgaacggc tcgccgcgtg gctggccgag 60 gacatggccg ccgtcaacgg gctgatccgc gagcggatgg cctcgaaaca cgcgccccgc 120 attcccgagg tcacggcgca tctggtcgag gccggcggca agcggctgcg gccgctcctg 180 acgctcgccg cggcgcggct gtgcggctac gaggggccct atcacatcca tctggccgcg 240 acggtggagt tcatccacac ggcgacgctg cttcacgacg atgtggtgga cgaaagccac 300 cgccgccgcg gcaaacccac ggcgaacctg ctgtgggaca acaaatcctc ggtgctggtg 360 ggcgactatc tcttcgcccg cagcttccag ctgatggtcg agaccggctc gcttcgcgtg 420 atggacatcc tcgccaatgc ctcggccacc atctccgagg gcgaggtgct gcagctgacc 480 gcggcccagg atctgcgcac gaccgaggac 
atccacctgc aggtggtgcg cggcaagacg 540 gccgcgctct ttgccgcggc aaccgaggtg ggcggcgtgg tcgcgggcgt gcccgaggcg 600 caggtcgagg cgctccacgc ctacggggac gcgctgggga tcgccttcca gatcgtcgac 660 gacctcctcg attatggcgg cgtggatgcc cagatcggca agaacaccgg cgacgacttc 720 cgcgaacgca agctgacgct gccggtcatc aaggcggtgg cccaggccga tgccgaggag 780 cgcgccttct ggcagcgggt gatcgagaag ggcgaccagc gcgagggtga cctcgagcaa 840 gcccatgcga tcatgtcccg ccacggcgcc atggaggccg cccggcagga tgcgctccgc 900 tgggtcacgg tggcgcgcga ggcactcggc cagctgccgg agcacccgct gcgcgagatg 960 ctgcacgatc tggccgattt cgtggtcgaa cgcatcgcct ga 1002 39 333 PRT Rhodobacter sphaeroides 39 Met Gly Leu Asp Glu Val Ser Gln Lys Pro His Glu Arg Leu Ala Ala 1 5 10 15 Trp Leu Ala Glu Asp Met Ala Ala Val Asn Gly Leu Ile Arg Glu Arg 20 25 30 Met Ala Ser Lys His Ala Pro Arg Ile Pro Glu Val Thr Ala His Leu 35 40 45 Val Glu Ala Gly Gly Lys Arg Leu Arg Pro Leu Leu Thr Leu Ala Ala 50 55 60 Ala Arg Leu Cys Gly Tyr Glu Gly Pro Tyr His Ile His Leu Ala Ala 65 70 75 80 Thr Val Glu Phe Ile His Thr Ala Thr Leu Leu His Asp Asp Val Val 85 90 95 Asp Glu Ser His Arg Arg Arg Gly Lys Pro Thr Ala Asn Leu Leu Trp 100 105 110 Asp Asn Lys Ser Ser Val Leu Val Gly Asp Tyr Leu Phe Ala Arg Ser 115 120 125 Phe Gln Leu Met Val Glu Thr Gly Ser Leu Arg Val Met Asp Ile Leu 130 135 140 Ala Asn Ala Ser Ala Thr Ile Ser Glu Gly Glu Val Leu Gln Leu Thr 145 150 155 160 Ala Ala Gln Asp Leu Arg Thr Thr Glu Asp Ile His Leu Gln Val Val 165 170 175 Arg Gly Lys Thr Ala Ala Leu Phe Ala Ala Ala Thr Glu Val Gly Gly 180 185 190 Val Val Ala Gly Val Pro Glu Ala Gln Val Glu Ala Leu His Ala Tyr 195 200 205 Gly Asp Ala Leu Gly Ile Ala Phe Gln Ile Val Asp Asp Leu Leu Asp 210 215 220 Tyr Gly Gly Val Asp Ala Gln Ile Gly Lys Asn Thr Gly Asp Asp Phe 225 230 235 240 Arg Glu Arg Lys Leu Thr Leu Pro Val Ile Lys Ala Val Ala Gln Ala 245 250 255 Asp Ala Glu Glu Arg Ala Phe Trp Gln Arg Val Ile Glu Lys Gly Asp 260 265 270 Gln Arg Glu Gly Asp Leu Glu Gln Ala His Ala Ile Met Ser Arg His 275 280 285 Gly Ala Met Glu Ala 
Ala Arg Gln Asp Ala Leu Arg Trp Val Thr Val 290 295 300 Ala Arg Glu Ala Leu Gly Gln Leu Pro Glu His Pro Leu Arg Glu Met 305 310 315 320 Leu His Asp Leu Ala Asp Phe Val Val Glu Arg Ile Ala 325 330 40 1833 DNA Sphingomonas trueperi 40 ggatcgcgca gcgcctcggc cacgcgcacc atcagcagca gattgccgtt cggcagccgc 60 gcgaagccgg ggttgaaggc gccaaggaca taggtcgcgt cgtccacccc ctcgcgcagc 120 ggtgagcggg tcaggtcgac attgtcgggc cggaagatca gataatcgtc gctcaagcgc 180 ttgccccctc gggtttcacg cccagcaacg gggtcaggcc ccgggggttc cggcttcagc 240 gccggcttcc tgggcctggc ggtggtgccg gatcacctcg tcgatgatga agcgcaggaa 300 tttctcggaa aattcggggt cgagatcggc atcctgcgcc agcgcgcgca gccgggcgat 360 ctgcgcctcc tcgcggccgg gatcggcggg cggcagcccg gattcggcct tgtagcgccc 420 caccgcctgg gtcaccttga accgctcggc gagcatgaag acgagcgccg catcgatatt 480 gtcgatgctc tggcgatagc gggtcagcgt cgcgtcggtc atgcgaatct cctttgccgc 540 tgcggcacgg ccatgcaagc acctcttgcc tttgcaatgc acaaaggcca gaggctcgtt 600 gcatatgagc gcaaccgtcc accgcctggg ctcgcgaacc cagccttcgc tcgatccgat 660 catggcgctg gtcgcccagg acatgaacct ggtgaacgcg gtgatcctcg atcgcatgca 720 gtccgagatc ccgctgatcc ccgaactcgc cggccatctg atcgctggcg gcggcaagcg 780 gatgcggccg atgctgacgc tcgccagcgc ccggctgctc ggctattcgg gcacgcgcca 840 ccacaagctg gcggcggcag tggagttcat ccacaccgcg acgctgctgc atgacgacgt 900 ggtcgacagc tcggacctgc gccgcggccg ccgcaccgcc aacatcatct ggggcaatcc 960 cgccagcgtg ctggtcggcg acttcctgtt cagccgctcg ttcgagctga tggtcgaggc 1020 cgaaagcctc aaggcgctgc acatcctgtc gaacgccagc gcggtgatcg ccgagggcga 1080 agtcaaccag ctgaccgcgg tgcgccggat cgacctgtcc gaggatcgct atctcgacat 1140 catcggcgcc aagactgcgg cgctgttcgc cgccgcctgc cgggtggcgg gcgtggtcgc 1200 cgagcgtccc gaggcggagg aactcgcgct cgacgcctat ggccgcaacc tcggcatcgc 1260 tttccagctg gtcgacgacg cgatcgacta tgtctcggac gcgtcgacga tgggcaagga 1320 tgccggcgac gatttccgcg aaggcaagat gacgctgccg gtggtcctgg cgtacgcgcg 1380 cggcgacgag gcggaacgcg gcttctggaa ggaagcgatt tcgggccgcc gcatctcgga 1440 cgaggatttc gccgaggcga tccggctggt gcagagctgc cgcgcggtgg acgacacgct 1500 
cgcccgtgcc cgccattacg gccagctcgc gatcgatgcg ctgggcggct tccgcgcctg 1560 cgaggcgaag gacgcgatgg tcgaggcggt cgaattcgcg gtggcgcgcg cctactgacg 1620 cgcgccgacc ggagcatttc cgggtggatc gcttgcgatc caaggctcgg gaaatgcgac 1680 catcaaaaag cttccgggga ttacgcctcg gtcgactttt cttcgccctc gtcctcgtcg 1740 acttcgagcg cgtcttcctc gtccatgtcg agcactacct cgatgccctc gacgatcagg 1800 tcgagctgct cgtagctcgc cgtcatctcg atc 1833 41 1014 DNA Sphingomonas trueperi 41 atgagcgcaa ccgtccaccg cctgggctcg cgaacccagc cttcgctcga tccgatcatg 60 gcgctggtcg cccaggacat gaacctggtg aacgcggtga tcctcgatcg catgcagtcc 120 gagatcccgc tgatccccga actcgccggc catctgatcg ctggcggcgg caagcggatg 180 cggccgatgc tgacgctcgc cagcgcccgg ctgctcggct attcgggcac gcgccaccac 240 aagctggcgg cggcagtgga gttcatccac accgcgacgc tgctgcatga cgacgtggtc 300 gacagctcgg acctgcgccg cggccgccgc accgccaaca tcatctgggg caatcccgcc 360 agcgtgctgg tcggcgactt cctgttcagc cgctcgttcg agctgatggt cgaggccgaa 420 agcctcaagg cgctgcacat cctgtcgaac gccagcgcgg tgatcgccga gggcgaagtc 480 aaccagctga ccgcggtgcg ccggatcgac ctgtccgagg atcgctatct cgacatcatc 540 ggcgccaaga ctgcggcgct gttcgccgcc gcctgccggg tggcgggcgt ggtcgccgag 600 cgtcccgagg cggaggaact cgcgctcgac gcctatggcc gcaacctcgg catcgctttc 660 cagctggtcg acgacgcgat cgactatgtc tcggacgcgt cgacgatggg caaggatgcc 720 ggcgacgatt tccgcgaagg caagatgacg ctgccggtgg tcctggcgta cgcgcgcggc 780 gacgaggcgg aacgcggctt ctggaaggaa gcgatttcgg gccgccgcat ctcggacgag 840 gatttcgccg aggcgatccg gctggtgcag agctgccgcg cggtggacga cacgctcgcc 900 cgtgcccgcc attacggcca gctcgcgatc gatgcgctgg gcggcttccg cgcctgcgag 960 gcgaaggacg cgatggtcga ggcggtcgaa ttcgcggtgg cgcgcgccta ctga 1014 42 337 PRT Sphingomonas trueperi 42 Met Ser Ala Thr Val His Arg Leu Gly Ser Arg Thr Gln Pro Ser Leu 1 5 10 15 Asp Pro Ile Met Ala Leu Val Ala Gln Asp Met Asn Leu Val Asn Ala 20 25 30 Val Ile Leu Asp Arg Met Gln Ser Glu Ile Pro Leu Ile Pro Glu Leu 35 40 45 Ala Gly His Leu Ile Ala Gly Gly Gly Lys Arg Met Arg Pro Met Leu 50 55 60 Thr Leu Ala Ser Ala Arg Leu Leu Gly Tyr Ser Gly Thr 
Arg His His 65 70 75 80 Lys Leu Ala Ala Ala Val Glu Phe Ile His Thr Ala Thr Leu Leu His 85 90 95 Asp Asp Val Val Asp Ser Ser Asp Leu Arg Arg Gly Arg Arg Thr Ala 100 105 110 Asn Ile Ile Trp Gly Asn Pro Ala Ser Val Leu Val Gly Asp Phe Leu 115 120 125 Phe Ser Arg Ser Phe Glu Leu Met Val Glu Ala Glu Ser Leu Lys Ala 130 135 140 Leu His Ile Leu Ser Asn Ala Ser Ala Val Ile Ala Glu Gly Glu Val 145 150 155 160 Asn Gln Leu Thr Ala Val Arg Arg Ile Asp Leu Ser Glu Asp Arg Tyr 165 170 175 Leu Asp Ile Ile Gly Ala Lys Thr Ala Ala Leu Phe Ala Ala Ala Cys 180 185 190 Arg Val Ala Gly Val Val Ala Glu Arg Pro Glu Ala Glu Glu Leu Ala 195 200 205 Leu Asp Ala Tyr Gly Arg Asn Leu Gly Ile Ala Phe Gln Leu Val Asp 210 215 220 Asp Ala Ile Asp Tyr Val Ser Asp Ala Ser Thr Met Gly Lys Asp Ala 225 230 235 240 Gly Asp Asp Phe Arg Glu Gly Lys Met Thr Leu Pro Val Val Leu Ala 245 250 255 Tyr Ala Arg Gly Asp Glu Ala Glu Arg Gly Phe Trp Lys Glu Ala Ile 260 265 270 Ser Gly Arg Arg Ile Ser Asp Glu Asp Phe Ala Glu Ala Ile Arg Leu 275 280 285 Val Gln Ser Cys Arg Ala Val Asp Asp Thr Leu Ala Arg Ala Arg His 290 295 300 Tyr Gly Gln Leu Ala Ile Asp Ala Leu Gly Gly Phe Arg Ala Cys Glu 305 310 315 320 Ala Lys Asp Ala Met Val Glu Ala Val Glu Phe Ala Val Ala Arg Ala 325 330 335 Tyr 43 1137 DNA Schizosaccharomyces pombe 43 atgattcagt atgtatattt aaaacatatg aggaaattat ggagtcttgg aaaagtccgt 60 tcgactgttc ttcggttttc tactacgaac cgcaatgctt cacatttaat taaaaacgag 120 ttggaacaaa tctcaccagg gattcgtcaa atgctgaatt caaattcaga atttcttgaa 180 gagtgttcta aatattatac cattgctcaa ggaaaacaaa tgcgtccttc tcttgttttg 240 ctgatgtcca aagctacaag cttgtgccat ggtattgatc ggtccgtagt gggcgacaaa 300 tatattgatg atgatgattt aagatcattt tcgacgggtc aaattcttcc ttctcaattg 360 agattagcac aaataaccga gatgatccat atagcaagtt tgctgcatga cgatgtgatt 420 gatcacgcta atgtccgtag aggctcacct tcaagcaatg ttgctttcgg taatcgacgg 480 tcaatccttg cgggtaattt catccttgca cgggcttcga ctgctatggc ccgccttcga 540 aatccccaag ttacggagtt gttagctaca gtgatagcag acttggttcg aggtgagttt 600 
ttgcagctaa aaaatactat ggatccttca tctttggaaa taaaacaatc aaattttgac 660 tattatattg aaaaaagttt tttgaaaaca gccagtttaa tttccaaaag ctgcaaggct 720 tctacaatcc tcggacaatg ttctcctact gtagcaacag ctgctggaga atacggtcga 780 tgcattggta ctgcttttca actaatggat gacgtgttgg actatacgtc gaaagatgat 840 actttaggaa aggcggctgg tgcagatttg aagctagggt tggctacagc tcccgtcctc 900 tttgcatgga aaaagtatcc agaacttggt gcaatgattg tgaatagatt caatcatcct 960 tctgatatcc aacgggctcg ttctttggtt gagtgcactg atgctatcga gcaaaccatc 1020 acttgggcaa aagaatatat caaaaaagcc aaagattccc ttctgtgtct ccctgattca 1080 cctgcaagga aggcactttt tgcgttggct gataaagtaa taacgagaaa gaagtga 1137 44 948 DNA Gluconobacter suboxydans 44 atgctggcct gcaaccgggc gatcatcgcc cggatggaaa gtccggttcc cctgatcccg 60 cagcttggcg cccatcttgt cgcggcggga ggcaagcgcc ttcgcccgct gctgacgctg 120 gcctccgcac gtctgtgcgg ttatcagccg ggtccggacc atcagcgtca tgtcgggctc 180 gccgcctgcg ttgagttcat tcataccgcc acactgctgc atgatgatgt cgtggatgag 240 agcacgttgc gtcgggggct ggcttcggcc aatgccgtgt tcggcaacaa ggcgtccgtg 300 ctggtaggtg acttcctgtt cgcccgctcg ttccagctta tgacagcaga cggctccctg 360 aaggtcatgg cgatcctgtc ggatgcatcg gcgacaattg ctgaaggtga agtccttcag 420 atggtcgtgc agaacgacct tacgacgcct gtagaacgct atcttgaagt cattcacggc 480 aagacggctg cgctgtttgc ggctgcctgc cgtgtcggcg ctgtcgtggc cgagcgtccg 540 gaagcagaag aggaagctct ggagcggttt ggcaccaatc tgggtatggc gttccagctt 600 gttgatgatg ccctggatta tgccgcagac cagcaggttt tgggcaagac cgttggtgat 660 gacatgcgtg aaggcaagat caccctgccg gtcctggccg cctatgaggc tggctcgccg 720 gaagatcgta ttttctggga gcgcgtcatt ggagaagggg agcagactga ggacgatctg 780 cctcatgctc tgaacctgat tgcaaagacg ggtgcgatca atacgacgat cgcccgcgcg 840 caggtctatg ccgacgcagc tgttgaagcc ctgtccattt tcccggatag cgaactgcgc 900 cgccttctga tcgaaacggt tcagttcacg gtgaatcggg cccgctaa 948 45 978 DNA Rhodobacter capsulatus 45 atggccatcg atttcaagca agatattctc gctcctgttg ctcaagattt tgcagcgatg 60 gaccagttta ttaatgaagg aatcagctcc aaggtcgcac tggtcatgtc agtcagcaag 120 catgtcgttg aagcaggtgg aaagcgcatg cgtccgatta 
tgtgcttgct ggccgcttat 180 gcctgtggtg aaaccaattt aaagcatgca cagaagctgg cggccattat tgaaatgctg 240 catacggcga ctctggtaca tgatgatgat gtagatgagt ctggcttacg ccgtggcaga 300 ccaacagcaa atgcgacatg gaataaccag actgcggtac tggtggggga ttttctgatt 360 gcccgggcat ttgatctgct ggttgatctg gacaatatga tcctgttaaa ggacttctct 420 acaggaacct gtgagattgc tgagggtgaa gtattgcagt tgcaggcaca gcatcagcca 480 gatacaacag aagatattta tttacagatt attcacggta aaacctcacg gttgttcgaa 540 ctggcgaccg aaggcgctgc aatactggca ggcaaacctg aataccgtga acctttacgt 600 cgttttgccg gacactttgg caatgctttt cagattattg atgatattct ggattacact 660 tcagatgctg atacgctcgg caaaaatatt ggcgatgact tgatggaagg caaacccacc 720 ctgccgctga ttgcagcaat gcaaaatact caaggtgaac agcgcgacct gatccgtcgc 780 agcattgcca ctggcggtac ttcacagctt gaacaagtta ttgcgattgt acaaaattcg 840 ggagcgctgg attattgcca taagcgtgct actgaagaaa ccgagcgagc attacaggca 900 ctagaaatat tacctgagag tacttaccgg caggcgctgg ttaacttgac ccgcttagct 960 ttagaccgaa tccaataa 978 46 315 PRT Gluconobacter suboxydans 46 Met Leu Ala Cys Asn Arg Ala Ile Ile Ala Arg Met Glu Ser Pro Val 1 5 10 15 Pro Leu Ile Pro Gln Leu Gly Ala His Leu Val Ala Ala Gly Gly Lys 20 25 30 Arg Leu Arg Pro Leu Leu Thr Leu Ala Ser Ala Arg Leu Cys Gly Tyr 35 40 45 Gln Pro Gly Pro Asp His Gln Arg His Val Gly Leu Ala Ala Cys Val 50 55 60 Glu Phe Ile His Thr Ala Thr Leu Leu His Asp Asp Val Val Asp Glu 65 70 75 80 Ser Thr Leu Arg Arg Gly Leu Ala Ser Ala Asn Ala Val Phe Gly Asn 85 90 95 Lys Ala Ser Val Leu Val Gly Asp Phe Leu Phe Ala Arg Ser Phe Gln 100 105 110 Leu Met Thr Ala Asp Gly Ser Leu Lys Val Met Ala Ile Leu Ser Asp 115 120 125 Ala Ser Ala Thr Ile Ala Glu Gly Glu Val Leu Gln Met Val Val Gln 130 135 140 Asn Asp Leu Thr Thr Pro Val Glu Arg Tyr Leu Glu Val Ile His Gly 145 150 155 160 Lys Thr Ala Ala Leu Phe Ala Ala Ala Cys Arg Val Gly Ala Val Val 165 170 175 Ala Glu Arg Pro Glu Ala Glu Glu Glu Ala Leu Glu Arg Phe Gly Thr 180 185 190 Asn Leu Gly Met Ala Phe Gln Leu Val Asp Asp Ala Leu Asp Tyr Ala 195 200 205 Ala Asp Gln Gln Val 
Leu Gly Lys Thr Val Gly Asp Asp Met Arg Glu 210 215 220 Gly Lys Ile Thr Leu Pro Val Leu Ala Ala Tyr Glu Ala Gly Ser Pro 225 230 235 240 Glu Asp Arg Ile Phe Trp Glu Arg Val Ile Gly Glu Gly Glu Gln Thr 245 250 255 Glu Asp Asp Leu Pro His Ala Leu Asn Leu Ile Ala Lys Thr Gly Ala 260 265 270 Ile Asn Thr Thr Ile Ala Arg Ala Gln Val Tyr Ala Asp Ala Ala Val 275 280 285 Glu Ala Leu Ser Ile Phe Pro Asp Ser Glu Leu Arg Arg Leu Leu Ile 290 295 300 Glu Thr Val Gln Phe Thr Val Asn Arg Ala Arg 305 310 315 47 378 PRT Schizosaccharomyces pombe 47 Met Ile Gln Tyr Val Tyr Leu Lys His Met Arg Lys Leu Trp Ser Leu 1 5 10 15 Gly Lys Val Arg Ser Thr Val Leu Arg Phe Ser Thr Thr Asn Arg Asn 20 25 30 Ala Ser His Leu Ile Lys Asn Glu Leu Glu Gln Ile Ser Pro Gly Ile 35 40 45 Arg Gln Met Leu Asn Ser Asn Ser Glu Phe Leu Glu Glu Cys Ser Lys 50 55 60 Tyr Tyr Thr Ile Ala Gln Gly Lys Gln Met Arg Pro Ser Leu Val Leu 65 70 75 80 Leu Met Ser Lys Ala Thr Ser Leu Cys His Gly Ile Asp Arg Ser Val 85 90 95 Val Gly Asp Lys Tyr Ile Asp Asp Asp Asp Leu Arg Ser Phe Ser Thr 100 105 110 Gly Gln Ile Leu Pro Ser Gln Leu Arg Leu Ala Gln Ile Thr Glu Met 115 120 125 Ile His Ile Ala Ser Leu Leu His Asp Asp Val Ile Asp His Ala Asn 130 135 140 Val Arg Arg Gly Ser Pro Ser Ser Asn Val Ala Phe Gly Asn Arg Arg 145 150 155 160 Ser Ile Leu Ala Gly Asn Phe Ile Leu Ala Arg Ala Ser Thr Ala Met 165 170 175 Ala Arg Leu Arg Asn Pro Gln Val Thr Glu Leu Leu Ala Thr Val Ile 180 185 190 Ala Asp Leu Val Arg Gly Glu Phe Leu Gln Leu Lys Asn Thr Met Asp 195 200 205 Pro Ser Ser Leu Glu Ile Lys Gln Ser Asn Phe Asp Tyr Tyr Ile Glu 210 215 220 Lys Ser Phe Leu Lys Thr Ala Ser Leu Ile Ser Lys Ser Cys Lys Ala 225 230 235 240 Ser Thr Ile Leu Gly Gln Cys Ser Pro Thr Val Ala Thr Ala Ala Gly 245 250 255 Glu Tyr Gly Arg Cys Ile Gly Thr Ala Phe Gln Leu Met Asp Asp Val 260 265 270 Leu Asp Tyr Thr Ser Lys Asp Asp Thr Leu Gly Lys Ala Ala Gly Ala 275 280 285 Asp Leu Lys Leu Gly Leu Ala Thr Ala Pro Val Leu Phe Ala Trp Lys 290 295 300 Lys Tyr Pro Glu Leu Gly 
Ala Met Ile Val Asn Arg Phe Asn His Pro 305 310 315 320 Ser Asp Ile Gln Arg Ala Arg Ser Leu Val Glu Cys Thr Asp Ala Ile 325 330 335 Glu Gln Thr Ile Thr Trp Ala Lys Glu Tyr Ile Lys Lys Ala Lys Asp 340 345 350 Ser Leu Leu Cys Leu Pro Asp Ser Pro Ala Arg Lys Ala Leu Phe Ala 355 360 365 Leu Ala Asp Lys Val Ile Thr Arg Lys Lys 370 375 48 325 PRT Rhodobacter capsulatus 48 Met Ala Ile Asp Phe Lys Gln Asp Ile Leu Ala Pro Val Ala Gln Asp 1 5 10 15 Phe Ala Ala Met Asp Gln Phe Ile Asn Glu Gly Ile Ser Ser Lys Val 20 25 30 Ala Leu Val Met Ser Val Ser Lys His Val Val Glu Ala Gly Gly Lys 35 40 45 Arg Met Arg Pro Ile Met Cys Leu Leu Ala Ala Tyr Ala Cys Gly Glu 50 55 60 Thr Asn Leu Lys His Ala Gln Lys Leu Ala Ala Ile Ile Glu Met Leu 65 70 75 80 His Thr Ala Thr Leu Val His Asp Asp Asp Val Asp Glu Ser Gly Leu 85 90 95 Arg Arg Gly Arg Pro Thr Ala Asn Ala Thr Trp Asn Asn Gln Thr Ala 100 105 110 Val Leu Val Gly Asp Phe Leu Ile Ala Arg Ala Phe Asp Leu Leu Val 115 120 125 Asp Leu Asp Asn Met Ile Leu Leu Lys Asp Phe Ser Thr Gly Thr Cys 130 135 140 Glu Ile Ala Glu Gly Glu Val Leu Gln Leu Gln Ala Gln His Gln Pro 145 150 155 160 Asp Thr Thr Glu Asp Ile Tyr Leu Gln Ile Ile His Gly Lys Thr Ser 165 170 175 Arg Leu Phe Glu Leu Ala Thr Glu Gly Ala Ala Ile Leu Ala Gly Lys 180 185 190 Pro Glu Tyr Arg Glu Pro Leu Arg Arg Phe Ala Gly His Phe Gly Asn 195 200 205 Ala Phe Gln Ile Ile Asp Asp Ile Leu Asp Tyr Thr Ser Asp Ala Asp 210 215 220 Thr Leu Gly Lys Asn Ile Gly Asp Asp Leu Met Glu Gly Lys Pro Thr 225 230 235 240 Leu Pro Leu Ile Ala Ala Met Gln Asn Thr Gln Gly Glu Gln Arg Asp 245 250 255 Leu Ile Arg Arg Ser Ile Ala Thr Gly Gly Thr Ser Gln Leu Glu Gln 260 265 270 Val Ile Ala Ile Val Gln Asn Ser Gly Ala Leu Asp Tyr Cys His Lys 275 280 285 Arg Ala Thr Glu Glu Thr Glu Arg Ala Leu Gln Ala Leu Glu Ile Leu 290 295 300 Pro Glu Ser Thr Tyr Arg Gln Ala Leu Val Asn Leu Thr Arg Leu Ala 305 310 315 320 Leu Asp Arg Ile Gln 325 49 325 PRT Rhodobacter capsulatus 49 Met Ala Ile Asp Phe Lys Gln Asp Ile Leu Ala Pro Val 
Ala Gln Asp 1 5 10 15 Phe Ala Ala Met Asp Gln Phe Ile Asn Glu Gly Ile Ser Ser Lys Val 20 25 30 Ala Leu Val Met Ser Val Ser Lys His Val Val Glu Ala Gly Gly Lys 35 40 45 Arg Met Arg Pro Ile Met Cys Leu Leu Ala Ala Tyr Ala Cys Gly Glu 50 55 60 Thr Asn Leu Lys His Ala Gln Lys Leu Ala Ala Ile Ile Glu Met Leu 65 70 75 80 His Thr Ala Thr Leu Val His Asp Asp Val Val Asp Glu Ser Gly Leu 85 90 95 Arg Arg Gly Arg Pro Thr Ala Asn Ala Thr Trp Asn Asn Gln Thr Ala 100 105 110 Val Leu Val Gly Asp Phe Leu Ile Ala Arg Ala Phe Asp Leu Leu Val 115 120 125 Asp Leu Asp Asn Met Ile Leu Leu Lys Asp Phe Ser Thr Gly Thr Cys 130 135 140 Glu Ile Ala Glu Gly Glu Val Leu Gln Leu Gln Ala Gln His Gln Pro 145 150 155 160 Asp Thr Thr Glu Asp Ile Tyr Leu Gln Ile Ile His Gly Lys Thr Ser 165 170 175 Arg Leu Phe Glu Leu Ala Thr Glu Gly Ala Ala Ile Leu Ala Gly Lys 180 185 190 Pro Glu Tyr Arg Glu Pro Leu Arg Arg Phe Ala Gly His Phe Gly Asn 195 200 205 Ala Phe Gln Ile Ile Asp Asp Ile Leu Asp Tyr Thr Ser Asp Ala Asp 210 215 220 Thr Leu Gly Lys Asn Ile Gly Asp Asp Leu Met Glu Gly Lys Pro Thr 225 230 235 240 Leu Pro Leu Ile Ala Ala Met Gln Asn Thr Gln Gly Glu Gln Arg Asp 245 250 255 Leu Ile Arg Arg Ser Ile Ala Thr Gly Gly Thr Ser Gln Leu Glu Gln 260 265 270 Val Ile Ala Ile Val Gln Asn Ser Gly Ala Leu Asp Tyr Cys His Lys 275 280 285 Arg Ala Thr Glu Glu Thr Glu Arg Ala Leu Gln Ala Leu Glu Ile Leu 290 295 300 Pro Glu Ser Thr Tyr Arg Gln Ala Leu Val Asn Leu Thr Arg Leu Ala 305 310 315 320 Leu Asp Arg Ile Gln 325 50 327 PRT Rickettsia prowazekii 50 Met Asn Ile Ile Val Lys Ile Gln Gln Asn Leu Lys Asp Glu Val Thr 1 5 10 15 Gln Leu Asn Asp Leu Ile Ile Ser Cys Leu Lys Ser Asp Ala Glu Leu 20 25 30 Ile Glu Lys Val Gly Lys Tyr Leu Val Glu Ala Gly Gly Lys Arg Ile 35 40 45 Arg Pro Leu Leu Thr Ile Ile Thr Ala Lys Met Phe Asp Tyr Lys Gly 50 55 60 Asn Asn His Ile Lys Leu Ala Ser Ala Val Glu Phe Ile His Ala Ala 65 70 75 80 Thr Leu Leu His Asp Asp Val Val Asp Asn Ser Thr Leu Arg Arg Phe 85 90 95 Lys Pro Thr Ala Asn Val Ile Trp 
Gly Ser Lys Thr Ser Ile Leu Val 100 105 110 Gly Asp Phe Leu Phe Ser Gln Ser Phe Lys Leu Met Val Ala Ser Gly 115 120 125 Cys Ile Lys Ala Met Asn Val Leu Ala Lys Ala Ser Val Ile Ile Ser 130 135 140 Glu Gly Glu Val Val Gln Leu Val Lys Leu Asn Glu Arg Arg Ile Ile 145 150 155 160 Thr Ile Asp Glu Tyr Gln Gln Ile Val Lys Ser Lys Thr Ala Glu Leu 165 170 175 Phe Gly Ala Ala Cys Glu Val Gly Ala Ile Ile Ala Glu Gln Val Asp 180 185 190 Arg Val Ser Lys Asp Val Gln Asn Phe Gly Arg Leu Leu Gly Thr Ile 195 200 205 Phe Gln Val Ile Asp Asp Leu Leu Asp Tyr Leu Gly Ser Asp Lys Gln 210 215 220 Val Gly Lys Asn Ile Gly Asp Asp Phe Leu Glu Gly Lys Val Thr Leu 225 230 235 240 Pro Leu Ile Phe Leu Tyr His Lys Leu Glu Gln Asp Lys Gln Leu Trp 245 250 255 Leu Glu Asn Met Leu Lys Ser Asp Lys Arg Thr Lys Asp Asp Phe Val 260 265 270 Lys Ile Arg Asp Leu Met Leu Lys His Ala Ile Tyr Asn Glu Thr Val 275 280 285 Asn Tyr Leu Ser Ser Leu Glu Asn Glu Ala Asn Asn Leu Leu Asn Lys 290 295 300 Ile Pro Val Gln Asn Ile Tyr Lys Tyr Tyr Leu Phe Ser Ile Ile Arg 305 310 315 320 Phe Ile Leu Tyr Arg Ser Tyr 325 51 323 PRT Escherichia coli 51 Met Asn Leu Glu Lys Ile Asn Glu Leu Thr Ala Gln Asp Met Ala Gly 1 5 10 15 Val Asn Ala Ala Ile Leu Glu Gln Leu Asn Ser Asp Val Gln Leu Ile 20 25 30 Asn Gln Leu Gly Tyr Tyr Ile Val Ser Gly Gly Gly Lys Arg Ile Arg 35 40 45 Pro Met Ile Ala Val Leu Ala Ala Arg Ala Val Gly Tyr Glu Gly Asn 50 55 60 Ala His Val Thr Ile Ala Ala Leu Ile Glu Phe Ile His Thr Ala Thr 65 70 75 80 Leu Leu His Asp Asp Val Val Asp Glu Ser Asp Met Arg Arg Gly Lys 85 90 95 Ala Thr Ala Asn Ala Ala Phe Gly Asn Ala Ala Ser Val Leu Val Gly 100 105 110 Asp Phe Ile Tyr Thr Arg Ala Phe Gln Met Met Thr Ser Leu Gly Ser 115 120 125 Leu Lys Val Leu Glu Val Met Ser Glu Ala Val Asn Val Ile Ala Glu 130 135 140 Gly Glu Val Leu Gln Leu Met Asn Val Asn Asp Pro Asp Ile Thr Glu 145 150 155 160 Glu Asn Tyr Met Arg Val Ile Tyr Ser Lys Thr Ala Arg Leu Phe Glu 165 170 175 Ala Ala Ala Gln Cys Ser Gly Ile Leu Ala Gly Cys Thr Pro Glu Glu 
180 185 190 Glu Lys Gly Leu Gln Asp Tyr Gly Arg Tyr Leu Gly Thr Ala Phe Gln 195 200 205 Leu Ile Asp Asp Leu Leu Asp Tyr Asn Ala Asp Gly Glu Gln Leu Gly 210 215 220 Lys Asn Val Gly Asp Asp Leu Asn Glu Gly Lys Pro Thr Leu Pro Leu 225 230 235 240 Leu His Ala Met His His Gly Thr Pro Glu Gln Ala Gln Met Ile Arg 245 250 255 Thr Ala Ile Glu Gln Gly Asn Gly Arg His Leu Leu Glu Pro Val Leu 260 265 270 Glu Ala Met Asn Ala Cys Gly Ser Leu Glu Trp Thr Arg Gln Arg Ala 275 280 285 Glu Glu Glu Ala Asp Lys Ala Ile Ala Ala Leu Gln Val Leu Pro Asp 290 295 300 Thr Pro Trp Arg Glu Ala Leu Ile Gly Leu Ala His Ile Ala Val Gln 305 310 315 320 Arg Asp Arg 52 329 PRT Haemophilus influenzae 52 Met Lys Lys Gln Asp Leu Met Ser Ile Asp Glu Ile Gln Lys Leu Ala 1 5 10 15 Asp Pro Asp Met Gln Lys Val Asn Gln Asn Ile Leu Ala Gln Leu Asn 20 25 30 Ser Asp Val Pro Leu Ile Gly Gln Leu Gly Phe Tyr Ile Val Gln Gly 35 40 45 Gly Gly Lys Arg Ile Arg Pro Leu Ile Ala Val Leu Ala Ala Arg Ser 50 55 60 Leu Gly Phe Glu Gly Ser Asn Ser Ile Thr Cys Ala Thr Phe Val Glu 65 70 75 80 Phe Ile His Thr Ala Ser Leu Leu His Asp Asp Val Val Asp Glu Ser 85 90 95 Asp Met Arg Arg Gly Arg Ala Thr Ala Asn Ala Glu Phe Gly Asn Ala 100 105 110 Ala Ser Val Leu Val Gly Asp Phe Ile Tyr Thr Arg Ala Phe Gln Leu 115 120 125 Val Ala Gln Leu Glu Ser Leu Lys Ile Leu Ser Ile Met Ala Asp Ala 130 135 140 Thr Asn Val Leu Ala Glu Gly Glu Val Gln Gln Leu Met Asn Val Asn 145 150 155 160 Asp Pro Glu Thr Ser Glu Ala Asn Tyr Met Arg Val Ile Tyr Ser Lys 165 170 175 Thr Ala Arg Leu Phe Glu Val Ala Gly Gln Ala Ala Ala Ile Val Ala 180 185 190 Gly Gly Thr Glu Ala Gln Glu Lys Ala Leu Gln Asp Tyr Gly Arg Tyr 195 200 205 Leu Gly Thr Ala Phe Gln Leu Val Asp Asp Val Leu Asp Tyr Ser Ala 210 215 220 Asn Thr Gln Ala Leu Gly Lys Asn Val Gly Asp Asp Leu Ala Glu Gly 225 230 235 240 Lys Pro Thr Leu Pro Leu Leu His Ala Met Arg His Gly Asn Ala Gln 245 250 255 Gln Ala Ala Leu Ile Arg Glu Ala Ile Glu Gln Gly Gly Lys Arg Glu 260 265 270 Ala Ile Asp Glu Val Leu Ala Ile Met 
Thr Glu His Lys Ser Leu Asp 275 280 285 Tyr Ala Met Asn Arg Ala Lys Glu Glu Ala Gln Lys Ala Val Asp Ala 290 295 300 Ile Glu Ile Leu Pro Glu Ser Glu Tyr Lys Gln Ala Leu Ile Ser Leu 305 310 315 320 Ala Tyr Leu Ser Val Asp Arg Asn Tyr 325 53 24 DNA Artificial Sequence Primer 53 rtkattytma aygayaayga aatg 24 54 23 DNA Artificial Sequence Primer 54 tttgaagary tvggywttaa cta 23 55 21 DNA Artificial Sequence Primer 55 rcaycargct tayscvcaya a 21 56 22 DNA Artificial Sequence Primer 56 cgtgytgytc dgcrathgcb ac 22 57 23 DNA Artificial Sequence Primer 57 tgytcdgcra thgcbacrtc raa 23 58 20 DNA Artificial Sequence Primer 58 ggsccdatrt agttaawrcc 20 59 27 DNA Artificial Sequence Primer 59 tcgtgaccaa gaagggcaag ggctatg 27 60 27 DNA Artificial Sequence Primer 60 gacaagtatc acggcgtcca gaagttc 27 61 27 DNA Artificial Sequence Primer 61 atagcccttg cccttcttgg tcacgac 27 62 26 DNA Artificial Sequence Primer 62 cgaacggatc atactcgctc tcgctg 26 63 28 DNA Artificial Sequence Primer 63 tgaggatctt gtgcggatag cattggtg 28 64 26 DNA Artificial Sequence Primer 64 agcggcgtct tgggtaggtc agccat 26 65 30 DNA Artificial Sequence Primer 65 atatggtacc gtgtgactga cctgtccaac 30 66 30 DNA Artificial Sequence Primer 66 agtctctaga atgttggaga ttcaaggtgg 30 67 20 DNA Artificial Sequence Primer 67 ggwgghaarm gmmtkcgycc 20 68 20 DNA Artificial Sequence Primer 68 acwytgstdc atgatgatgt 20 69 20 DNA Artificial Sequence Primer 69 acnytnbtnc aygaygaygt 20 70 20 DNA Artificial Sequence Primer 70 tyrtcyacsa catcatcatg 20 71 23 DNA Artificial Sequence Primer 71 tghavkacyt caccytcrgm aat 23 72 20 DNA Artificial Sequence Primer 72 tartcnarda trtcrtcdat 20 73 20 DNA Artificial Sequence Primer 73 tcrtcnccna ynktyttncc 20 74 26 DNA Artificial Sequence Primer 74 tggaagctgc gggcgaagag atagtc 26 75 26 DNA Artificial Sequence Primer 75 cccaccagca ccgaggattt gttgtc 26 76 27 DNA Artificial Sequence Primer 76 gaacctgctg tgggacaaca aatcctc 27 77 27 DNA Artificial Sequence Primer 77 tcggtgctgg tgggcgacta tctcttc 27 78 30 DNA 
Artificial Sequence Primer 78 actagaattc cgcaacagtt ccttcatgtc 30 79 30 DNA Artificial Sequence Primer 79 atagaagctt acttgcggtc ggactgatag 30 80 23 DNA Artificial Sequence Primer 80 ctsstscayg aygaygtsgt sga 23 81 19 DNA Artificial Sequence Primer 81 gtsgmvgssg gsggsaarc 19 82 18 DNA Artificial Sequence Primer 82 ctsmtscayg aygaygts 18 83 20 DNA Artificial Sequence Primer 83 dssrtbctsg tsggsgaytt 20 84 21 DNA Artificial Sequence Primer 84 vakraartcs ccsacsagsa c 21 85 18 DNA Artificial Sequence Primer 85 sacytcsccy tcsgcrat 18 86 21 DNA Artificial Sequence Primer 86 rtcrtcsccv ayvktyttsc c 21 87 21 DNA Artificial Sequence Primer 87 sggsagsgtv rbyttsccyt c 21 88 26 DNA Artificial Sequence Primer 88 gtgctggtcg gcgacttcct gttcag 26 89 27 DNA Artificial Sequence Primer 89 atcgacctgt ccgaggatcg ctatctc 27 90 26 DNA Artificial Sequence Primer 90 tcgaacgagc ggctgaacag gaagtc 26 91 26 DNA Artificial Sequence Primer 91 tggcgggatt gccccagatg atgttg 26 92 30 DNA Artificial Sequence Primer 92 attaggtacc atcagataat cgtcgctcaa 30 93 30 DNA Artificial Sequence Primer 93 tataggatcc gacatggacg aggaagacgc 30 94 20 DNA Artificial Sequence Primer 94 cgatggtgac cacgaagaag 20 95 2017 DNA Sphingomonas trueperi 95 ggcccgggct ggtggggttt ctggcgctgg ggctggtgtt cggcgcgttc ttcttcgtcg 60 cgatcgtgac gcggaacgcc aagctggcgg cggggcaggt ctatgtcggg ctgccggtgc 120 tcgcgctgct gctgctccgc gaccatccgc agggctttgc cgcgacgctg tggacgatgg 180 cgatcgtctg ggtgtgcgac agcggcgcct attttgccgg tcgcgcgatc ggtgggccca 240 agctcgcgcc ctcgatcagc ccgaacaaga cctgggcggg gctgatcggc gggttggttg 300 ccgcgatcct gttctccgcc ggctatgtcg cgctggcgcc ggggagcgcg atcggctggt 360 ggctggtcgc ggtgtcgccg ctggtagcct tcgcctcgca gatcggcgac ctgtacgaga 420 gccatctcaa gcgggtcgcg ggcgtgaagg attcgagcaa cctgctgccc ggccatggcg 480 gcattctcga ccggctcgac ggccttgtct tcgcagcccc ggttgcagct ttgttttttg 540 cgatccatca tcaggtggtc gtgggaggat actggtggtg aagcgcgtca cggtgttggg 600 ggcgaccggc tcggtcggca cctcgacgct ggatctgatc gaacgaaatc cgcacgcctt 660 cgaagtcgtg 
gcgctgaccg caaattgcga tgtcgagaag ctggctgccg cggcgatccg 720 cacgcgcgcg cgctgcgccg tggtcgccga cgagaaatgc ctgccggcgc tacaggagcg 780 gctggccggc agcggtgtcg aggcgatggg cggggcgcat tcggtgtgcg acgtggcgcg 840 gatgggtgct gactggacga tggctgcgat cgtcggcagc gcagggctca agccggtgat 900 ggccgcgctg gaggccggtg gcaccgtcgc gctcgcgaac aaggagtcgc tcgtctcggc 960 gggtgaggtg atgatggcgg cggcccgcgc gcatggcgcg acgctgctgc cggtcgattc 1020 ggagcacaat gcggtgttcc agtgcctcga tcgcaccgcg cccaggggcg tccgccggat 1080 catccttacc gccagcggtg gtccgttccg cgcgacgccg aaggaagcga tgcgcgacat 1140 cacccccgca caggcggtgg cgcatcccaa ctggtcgatg ggcgccaaga tctcggtcga 1200 ctccgcgacg atgatgaaca aggggctcga actgatcgaa gccttccacc tgttcccggt 1260 cgccgccgag caactggccg tgctggtcca tcgccaatcc gtcgtccatt cgatggtgga 1320 atatgtcgac ggatcggtgc tggcccagct cggcacgccc gacatgcgca cgccgatcgc 1380 ctatgcgctg gcttggcccg agcggatgga gacgctgtgc ccgccgctcg accttgccac 1440 ggtgggtaag ctcgagttcg aaaatcccga tctcgatcgc ttcccggcgc tcgcgctggc 1500 gatggaggca ttgaaggcgg gcggggcgcg tccggccatt ctcaatgccg ccaacgaagt 1560 cgccgtcgcg gcctttctcg ccgggcggat cggattcctt gaaattgccg caatctctgc 1620 cgatacgctg tctcgctatg acccggccgc gccggaaacg ctcgatgccg tgctggcgat 1680 cgacgcggag gcgcggcttt acgcggctga gcgagtgaag gactgcgtcg cttgatccaa 1740 tcccccggca tcctgctcac cattctggcg ttcgcgctgg tgatcgggcc gctcgtgttc 1800 ctgcacgagc tgggacatta tctggcgggc cgcctcttcg gggtgaaggc cgaggaattc 1860 tcgatcggct tcggccgcga gatcgccggc accaccgatc gccgcggcac gcgctggaag 1920 ttcagcctgt tgccgctggg cggctatgtc cgcttcgccg gcgacatgaa cccggcgagc 1980 cagccttcgc ccgaatggct gcagaccagc ccgggcc 2017 96 1161 DNA Sphingomonas trueperi 96 gtggtgaagc gcgtcacggt gttgggggcg accggctcgg tcggcacctc gacgctggat 60 ctgatcgaac gaaatccgca cgccttcgaa gtcgtggcgc tgaccgcaaa ttgcgatgtc 120 gagaagctgg ctgccgcggc gatccgcacg cgcgcgcgct gcgccgtggt cgccgacgag 180 aaatgcctgc cggcgctaca ggagcggctg gccggcagcg gtgtcgaggc gatgggcggg 240 gcgcattcgg tgtgcgacgt ggcgcggatg ggtgctgact ggacgatggc tgcgatcgtc 300 ggcagcgcag ggctcaagcc 
ggtgatggcc gcgctggagg ccggtggcac cgtcgcgctc 360 gcgaacaagg agtcgctcgt ctcggcgggt gaggtgatga tggcggcggc ccgcgcgcat 420 ggcgcgacgc tgctgccggt cgattcggag cacaatgcgg tgttccagtg cctcgatcgc 480 accgcgccca ggggcgtccg ccggatcatc cttaccgcca gcggtggtcc gttccgcgcg 540 acgccgaagg aagcgatgcg cgacatcacc cccgcacagg cggtggcgca tcccaactgg 600 tcgatgggcg ccaagatctc ggtcgactcc gcgacgatga tgaacaaggg gctcgaactg 660 atcgaagcct tccacctgtt cccggtcgcc gccgagcaac tggccgtgct ggtccatcgc 720 caatccgtcg tccattcgat ggtggaatat gtcgacggat cggtgctggc ccagctcggc 780 acgcccgaca tgcgcacgcc gatcgcctat gcgctggctt ggcccgagcg gatggagacg 840 ctgtgcccgc cgctcgacct tgccacggtg ggtaagctcg agttcgaaaa tcccgatctc 900 gatcgcttcc cggcgctcgc gctggcgatg gaggcattga aggcgggcgg ggcgcgtccg 960 gccattctca atgccgccaa cgaagtcgcc gtcgcggcct ttctcgccgg gcggatcgga 1020 ttccttgaaa ttgccgcaat ctctgccgat acgctgtctc gctatgaccc ggccgcgccg 1080 gaaacgctcg atgccgtgct ggcgatcgac gcggaggcgc ggctttacgc ggctgagcga 1140 gtgaaggact gcgtcgcttg a 1161 97 386 PRT Sphingomonas trueperi 97 Val Val Lys Arg Val Thr Val Leu Gly Ala Thr Gly Ser Val Gly Thr 1 5 10 15 Ser Thr Leu Asp Leu Ile Glu Arg Asn Pro His Ala Phe Glu Val Val 20 25 30 Ala Leu Thr Ala Asn Cys Asp Val Glu Lys Leu Ala Ala Ala Ala Ile 35 40 45 Arg Thr Arg Ala Arg Cys Ala Val Val Ala Asp Glu Lys Cys Leu Pro 50 55 60 Ala Leu Gln Glu Arg Leu Ala Gly Ser Gly Val Glu Ala Met Gly Gly 65 70 75 80 Ala His Ser Val Cys Asp Val Ala Arg Met Gly Ala Asp Trp Thr Met 85 90 95 Ala Ala Ile Val Gly Ser Ala Gly Leu Lys Pro Val Met Ala Ala Leu 100 105 110 Glu Ala Gly Gly Thr Val Ala Leu Ala Asn Lys Glu Ser Leu Val Ser 115 120 125 Ala Gly Glu Val Met Met Ala Ala Ala Arg Ala His Gly Ala Thr Leu 130 135 140 Leu Pro Val Asp Ser Glu His Asn Ala Val Phe Gln Cys Leu Asp Arg 145 150 155 160 Thr Ala Pro Arg Gly Val Arg Arg Ile Ile Leu Thr Ala Ser Gly Gly 165 170 175 Pro Phe Arg Ala Thr Pro Lys Glu Ala Met Arg Asp Ile Thr Pro Ala 180 185 190 Gln Ala Val Ala His Pro Asn Trp Ser Met Gly Ala Lys Ile Ser Val 195 200 
205 Asp Ser Ala Thr Met Met Asn Lys Gly Leu Glu Leu Ile Glu Ala Phe 210 215 220 His Leu Phe Pro Val Ala Ala Glu Gln Leu Ala Val Leu Val His Arg 225 230 235 240 Gln Ser Val Val His Ser Met Val Glu Tyr Val Asp Gly Ser Val Leu 245 250 255 Ala Gln Leu Gly Thr Pro Asp Met Arg Thr Pro Ile Ala Tyr Ala Leu 260 265 270 Ala Trp Pro Glu Arg Met Glu Thr Leu Cys Pro Pro Leu Asp Leu Ala 275 280 285 Thr Val Gly Lys Leu Glu Phe Glu Asn Pro Asp Leu Asp Arg Phe Pro 290 295 300 Ala Leu Ala Leu Ala Met Glu Ala Leu Lys Ala Gly Gly Ala Arg Pro 305 310 315 320 Ala Ile Leu Asn Ala Ala Asn Glu Val Ala Val Ala Ala Phe Leu Ala 325 330 335 Gly Arg Ile Gly Phe Leu Glu Ile Ala Ala Ile Ser Ala Asp Thr Leu 340 345 350 Ser Arg Tyr Asp Pro Ala Ala Pro Glu Thr Leu Asp Ala Val Leu Ala 355 360 365 Ile Asp Ala Glu Ala Arg Leu Tyr Ala Ala Glu Arg Val Lys Asp Cys 370 375 380 Val Ala 385 98 388 PRT Bacillus subtilis 98 Met Lys Asn Ile Cys Leu Leu Gly Ala Thr Gly Ser Ile Gly Glu Gln 1 5 10 15 Thr Leu Asp Val Leu Arg Ala His Gln Asp Gln Phe Gln Leu Val Ser 20 25 30 Met Ser Phe Gly Arg Asn Ile Asp Lys Ala Val Pro Met Ile Glu Val 35 40 45 Phe Gln Pro Lys Phe Val Ser Val Gly Asp Leu Asp Thr Tyr His Lys 50 55 60 Leu Lys Gln Met Ser Phe Ser Phe Glu Cys Gln Ile Gly Leu Gly Glu 65 70 75 80 Glu Gly Leu Ile Glu Ala Ala Val Met Glu Glu Val Asp Ile Val Val 85 90 95 Asn Ala Leu Leu Gly Ser Val Gly Leu Ile Pro Thr Leu Lys Ala Ile 100 105 110 Glu Gln Lys Lys Thr Ile Ala Leu Ala Asn Lys Glu Thr Leu Val Thr 115 120 125 Ala Gly His Ile Val Lys Glu His Ala Lys Lys Tyr Asp Val Pro Leu 130 135 140 Leu Pro Val Asp Ser Glu His Ser Ala Ile Phe Gln Ala Leu Gln Gly 145 150 155 160 Glu Gln Ala Lys Asn Ile Glu Arg Leu Ile Ile Thr Ala Ser Gly Gly 165 170 175 Ser Phe Arg Asp Lys Thr Arg Glu Glu Leu Glu Ser Val Thr Val Glu 180 185 190 Asp Ala Leu Lys His Pro Asn Trp Ser Met Gly Ala Lys Ile Thr Ile 195 200 205 Asp Ser Ala Thr Met Met Asn Lys Gly Leu Glu Val Ile Glu Ala His 210 215 220 Trp Leu Phe Asp Ile Pro Tyr Glu Gln Ile Asp Val Val 
Leu His Lys 225 230 235 240 Glu Ser Ile Ile His Ser Met Val Glu Phe His Asp Lys Ser Val Ile 245 250 255 Ala Gln Leu Gly Thr Pro Asp Met Arg Val Pro Ile Gln Tyr Ala Leu 260 265 270 Thr Tyr Pro Asp Arg Leu Pro Leu Pro Asp Ala Lys Arg Leu Glu Leu 275 280 285 Trp Glu Ile Gly Ser Leu His Phe Glu Lys Ala Asp Phe Asp Arg Phe 290 295 300 Arg Cys Leu Gln Phe Ala Phe Glu Ser Gly Lys Ile Gly Gly Thr Met 305 310 315 320 Pro Thr Val Leu Asn Ala Ala Asn Glu Val Ala Val Ala Ala Phe Leu 325 330 335 Ala Gly Lys Ile Pro Phe Leu Ala Ile Glu Asp Cys Ile Glu Lys Ala 340 345 350 Leu Thr Arg His Gln Leu Leu Lys Lys Pro Ser Trp Arg Thr Phe Lys 355 360 365 Lys Trp Thr Lys Ile Pro Gly Asp Thr Ser Ile Gln Tyr Ser His Lys 370 375 380 Val Val Cys Ser 385 99 397 PRT Haemophilus influenzae 99 Met Gln Lys Gln Asn Ile Val Ile Leu Gly Ser Thr Gly Ser Ile Gly 1 5 10 15 Lys Ser Thr Leu Ser Val Ile Glu Asn Asn Pro Gln Lys Tyr His Ala 20 25 30 Phe Ala Leu Val Gly Gly Lys Asn Val Glu Ala Met Phe Glu Gln Cys 35 40 45 Ile Lys Phe Arg Pro His Phe Ala Ala Leu Asp Asp Val Asn Ala Ala 50 55 60 Lys Ile Leu Arg Glu Lys Leu Ile Ala His His Ile Pro Thr Glu Val 65 70 75 80 Leu Ala Gly Arg Arg Ala Ile Cys Glu Leu Ala Ala His Pro Asp Ala 85 90 95 Asp Gln Ile Met Ala Ser Ile Val Gly Ala Ala Gly Leu Leu Pro Thr 100 105 110 Leu Ser Ala Val Lys Ala Gly Lys Arg Val Leu Leu Ala Asn Lys Glu 115 120 125 Ser Leu Val Thr Cys Gly Gln Leu Phe Ile Asp Ala Val Lys Asn Tyr 130 135 140 Gly Ser Lys Leu Leu Pro Val Asp Ser Glu His Asn Ala Ile Phe Gln 145 150 155 160 Ser Leu Pro Pro Glu Ala Gln Glu Lys Ile Gly Phe Cys Pro Leu Ser 165 170 175 Glu Leu Gly Val Ser Lys Ile Ile Leu Thr Gly Ser Gly Gly Pro Phe 180 185 190 Arg Tyr Thr Pro Leu Glu Gln Phe Thr Asn Ile Thr Pro Glu Gln Ala 195 200 205 Val Ala His Pro Asn Trp Ser Met Gly Lys Lys Ile Ser Val Asp Ser 210 215 220 Ala Thr Met Met Asn Lys Gly Leu Glu Tyr Ile Glu Ala Arg Trp Leu 225 230 235 240 Phe Asn Ala Ser Ala Glu Glu Met Glu Val Ile Ile His Pro Gln Ser 245 250 255 Ile Ile His Ser 
Met Val Arg Tyr Val Asp Gly Ser Val Ile Thr Gln 260 265 270 Met Gly Asn Pro Asp Met Arg Thr Pro Ile Ala Glu Thr Met Ala Tyr 275 280 285 Pro His Arg Thr Phe Ala Gly Val Glu Pro Leu Asp Phe Phe Lys Ile 290 295 300 Lys Glu Leu Thr Phe Ile Glu Pro Asp Phe Asn Arg Tyr Pro Asn Leu 305 310 315 320 Lys Leu Ala Ile Asp Ala Phe Ala Ala Gly Gln Tyr Ala Thr Thr Ala 325 330 335 Met Asn Ala Ala Asn Glu Ile Ala Val Gln Ala Phe Leu Asp Arg Gln 340 345 350 Ile Gly Phe Met Asp Ile Ala Lys Ile Asn Ser Lys Thr Ile Glu Arg 355 360 365 Ile Ser Pro Tyr Thr Ile Gln Asn Ile Asp Asp Val Leu Glu Ile Asp 370 375 380 Ala Gln Ala Arg Glu Ile Ala Lys Thr Leu Leu Arg Glu 385 390 395 100 398 PRT Escherichia coli 100 Met Lys Gln Leu Thr Ile Leu Gly Ser Thr Gly Ser Ile Gly Cys Ser 1 5 10 15 Thr Leu Asp Val Val Arg His Asn Pro Glu His Phe Arg Val Val Ala 20 25 30 Leu Val Ala Gly Lys Asn Val Thr Arg Met Val Glu Gln Cys Leu Glu 35 40 45 Phe Ser Pro Arg Tyr Ala Val Met Asp Asp Glu Ala Ser Ala Lys Leu 50 55 60 Leu Lys Thr Met Leu Gln Gln Gln Gly Ser Arg Thr Glu Val Leu Ser 65 70 75 80 Gly Gln Gln Ala Ala Cys Asp Met Ala Ala Leu Glu Asp Val Asp Gln 85 90 95 Val Met Ala Ala Ile Val Gly Ala Ala Gly Leu Leu Pro Thr Leu Ala 100 105 110 Ala Ile Arg Ala Gly Lys Thr Ile Leu Leu Ala Asn Lys Glu Ser Leu 115 120 125 Val Thr Cys Gly Arg Leu Phe Met Asp Ala Val Lys Gln Ser Lys Ala 130 135 140 Gln Leu Leu Pro Val Asp Ser Glu His Asn Ala Ile Phe Gln Ser Leu 145 150 155 160 Pro Gln Pro Ile Gln His Asn Leu Gly Tyr Ala Asp Leu Glu Gln Asn 165 170 175 Gly Val Val Ser Ile Leu Leu Thr Gly Ser Gly Gly Pro Phe Arg Glu 180 185 190 Thr Pro Leu Arg Asp Leu Ala Thr Met Thr Pro Asp Gln Ala Cys Arg 195 200 205 His Pro Asn Trp Ser Met Gly Arg Lys Ile Ser Val Asp Ser Ala Thr 210 215 220 Met Met Asn Lys Gly Leu Glu Tyr Ile Glu Ala Arg Trp Leu Phe Asn 225 230 235 240 Ala Ser Ala Ser Gln Met Glu Val Leu Ile His Pro Gln Ser Val Ile 245 250 255 His Ser Met Val Arg Tyr Gln Asp Gly Ser Val Leu Ala Gln Leu Gly 260 265 270 Glu Pro Asp Met Arg 
Thr Pro Ile Ala His Thr Met Ala Trp Pro Asn 275 280 285 Arg Val Asn Ser Gly Val Lys Pro Leu Asp Phe Cys Lys Leu Ser Ala 290 295 300 Leu Thr Phe Ala Ala Pro Asp Tyr Asp Arg Tyr Pro Cys Leu Lys Leu 305 310 315 320 Ala Met Glu Ala Phe Glu Gln Gly Gln Ala Ala Thr Thr Ala Leu Asn 325 330 335 Ala Ala Asn Glu Ile Thr Val Ala Ala Phe Leu Ala Gln Gln Ile Arg 340 345 350 Phe Thr Asp Ile Ala Ala Leu Asn Leu Ser Val Leu Glu Lys Met Asp 355 360 365 Met Arg Glu Pro Gln Cys Val Asp Asp Val Leu Ser Val Asp Ala Asn 370 375 380 Ala Arg Glu Val Ala Arg Lys Glu Val Met Arg Leu Ala Ser 385 390 395 101 388 PRT Zymomonas mobilis 101 Met Ser Gln Pro Arg Thr Val Thr Val Leu Gly Ala Thr Gly Ser Ile 1 5 10 15 Gly His Ser Thr Leu Asp Leu Ile Glu Arg Asn Leu Asp Arg Tyr Gln 20 25 30 Val Ile Ala Leu Thr Ala Asn Arg Asn Val Lys Asp Leu Ala Asp Ala 35 40 45 Ala Lys Arg Thr Asn Ala Lys Arg Ala Val Ile Ala Asp Pro Ser Leu 50 55 60 Tyr Asn Asp Leu Lys Glu Ala Leu Ala Gly Ser Ser Val Glu Ala Ala 65 70 75 80 Ala Gly Ala Asp Ala Leu Val Glu Ala Ala Met Met Gly Ala Asp Trp 85 90 95 Thr Met Ala Ala Ile Ile Gly Cys Ala Gly Leu Lys Ala Thr Leu Ala 100 105 110 Ala Ile Arg Lys Gly Lys Thr Val Ala Leu Ala Asn Lys Glu Ser Leu 115 120 125 Val Ser Ala Gly Gly Leu Met Ile Asp Ala Val Arg Glu His Gly Thr 130 135 140 Thr Leu Leu Pro Val Asp Ser Glu His Asn Ala Ile Phe Gln Cys Phe 145 150 155 160 Pro His His Asn Arg Asp Tyr Val Arg Arg Ile Ile Ile Thr Ala Ser 165 170 175 Gly Gly Pro Phe Arg Thr Thr Ser Leu Ala Glu Met Ala Thr Val Thr 180 185 190 Pro Glu Arg Ala Val Gln His Pro Asn Trp Ser Met Gly Ala Lys Ile 195 200 205 Ser Ile Asp Ser Ala Thr Met Met Asn Lys Gly Leu Glu Leu Ile Glu 210 215 220 Ala Tyr His Leu Phe Gln Ile Pro Leu Glu Lys Phe Glu Ile Leu Val 225 230 235 240 His Pro Gln Ser Val Ile His Ser Met Val Glu Tyr Leu Asp Gly Ser 245 250 255 Ile Leu Ala Gln Ile Gly Ser Pro Asp Met Arg Thr Pro Ile Gly His 260 265 270 Thr Leu Ala Trp Pro Lys Arg Met Glu Thr Pro Ala Glu Ser Leu Asp 275 280 285 Phe Thr Lys Leu Arg 
Gln Met Asp Phe Glu Ala Pro Asp Tyr Glu Arg 290 295 300 Phe Pro Ala Leu Thr Leu Ala Met Glu Ser Ile Lys Ser Gly Gly Ala 305 310 315 320 Arg Pro Ala Val Met Asn Ala Ala Asn Glu Ile Ala Val Ala Ala Phe 325 330 335 Leu Asp Lys Lys Ile Gly Phe Leu Asp Ile Ala Lys Ile Val Glu Lys 340 345 350 Thr Leu Asp His Tyr Thr Pro Ala Thr Pro Ser Ser Leu Glu Asp Val 355 360 365 Phe Ala Ile Asp Asn Glu Ala Arg Ile Gln Ala Ala Ala Leu Met Glu 370 375 380 Ser Leu Pro Ala 385 102 402 PRT Synechococcus leopoliensis 102 Met Lys Ala Val Thr Leu Leu Gly Ser Thr Gly Ser Ile Gly Thr Gln 1 5 10 15 Thr Leu Asp Ile Leu Glu Gln Tyr Pro Asp Arg Phe Arg Leu Val Gly 20 25 30 Leu Ala Ala Gly Arg Asn Val Ala Leu Leu Ser Glu Gln Ile Arg Arg 35 40 45 His Arg Pro Glu Ile Val Ala Ile Gln Asp Ala Ala Gln Leu Ser Glu 50 55 60 Leu Gln Ala Ala Ile Ala Asp Leu Asp Asn Pro Pro Leu Ile Leu Thr 65 70 75 80 Gly Glu Ala Gly Val Thr Glu Val Ala Arg Tyr Gly Asp Ala Glu Ile 85 90 95 Val Val Thr Gly Ile Val Gly Cys Ala Gly Leu Leu Pro Thr Ile Ala 100 105 110 Ala Ile Glu Ala Gly Lys Asp Ile Ala Leu Ala Asn Lys Glu Thr Leu 115 120 125 Ile Ala Ala Gly Pro Val Val Leu Pro Leu Leu Gln Lys His Gly Val 130 135 140 Thr Ile Thr Pro Ala Asp Ser Glu His Ser Ala Ile Phe Gln Cys Ile 145 150 155 160 Gln Gly Leu Ser Thr His Ala Asp Phe Arg Pro Ala Gln Val Val Ala 165 170 175 Gly Leu Arg Arg Ile Leu Leu Thr Ala Ser Gly Gly Ala Phe Arg Asp 180 185 190 Trp Pro Val Glu Arg Leu Ser Gln Val Thr Val Ala Asp Ala Leu Lys 195 200 205 His Pro Asn Trp Ser Met Gly Arg Lys Ile Thr Val Asp Ser Ala Thr 210 215 220 Leu Met Asn Lys Gly Leu Glu Val Ile Glu Ala His Tyr Leu Phe Gly 225 230 235 240 Leu Asp Tyr Asp Tyr Ile Asp Ile Val Ile His Pro Gln Ser Ile Ile 245 250 255 His Ser Leu Ile Glu Leu Glu Asp Thr Ser Val Leu Ala Gln Leu Gly 260 265 270 Trp Pro Asp Met Arg Leu Pro Leu Leu Tyr Ala Leu Ser Trp Pro Asp 275 280 285 Arg Leu Ser Thr Gln Trp Ser Ala Leu Asp Leu Val Lys Ala Gly Ser 290 295 300 Leu Glu Phe Arg Glu Pro Asp His Ala Lys Tyr Pro Cys Met 
Asp Leu 305 310 315 320 Ala Tyr Ala Ala Gly Arg Lys Gly Gly Thr Met Pro Ala Val Leu Asn 325 330 335 Ala Ala Asn Glu Gln Ala Val Ala Leu Phe Leu Glu Glu Gln Ile His 340 345 350 Phe Ser Asp Ile Pro Arg Leu Ile Glu Arg Ala Cys Asp Arg His Gln 355 360 365 Thr Glu Trp Gln Gln Gln Pro Ser Leu Asp Asp Ile Leu Ala Tyr Asp 370 375 380 Ala Trp Ala Arg Gln Phe Val Gln Ala Ser Tyr Gln Ser Leu Glu Ser 385 390 395 400 Val Val 103 394 PRT Synechocystis sp. PCC 6803 103 Met Val Lys Arg Ile Ser Ile Leu Gly Ser Thr Gly Ser Ile Gly Thr 1 5 10 15 Gln Thr Leu Asp Ile Val Thr His His Pro Asp Ala Phe Gln Val Val 20 25 30 Gly Leu Ala Ala Gly Gly Asn Val Ala Leu Leu Ala Gln Gln Val Ala 35 40 45 Glu Phe Arg Pro Glu Ile Val Ala Ile Arg Gln Ala Glu Lys Leu Glu 50 55 60 Asp Leu Lys Ala Ala Val Ala Glu Leu Thr Asp Tyr Gln Pro Met Tyr 65 70 75 80 Val Val Gly Glu Glu Gly Val Val Glu Val Ala Arg Tyr Gly Asp Ala 85 90 95 Glu Ser Val Val Thr Gly Ile Val Gly Cys Ala Gly Leu Leu Pro Thr 100 105 110 Met Ala Ala Ile Ala Ala Gly Lys Asp Ile Ala Leu Ala Asn Lys Glu 115 120 125 Thr Leu Ile Ala Gly Ala Pro Val Val Leu Pro Leu Val Glu Lys Met 130 135 140 Gly Val Lys Leu Leu Pro Ala Asp Ser Glu His Ser Ala Ile Phe Gln 145 150 155 160 Cys Leu Gln Gly Val Pro Glu Gly Gly Leu Arg Arg Ile Ile Leu Thr 165 170 175 Ala Ser Gly Gly Ala Phe Arg Asp Leu Pro Val Glu Arg Leu Pro Phe 180 185 190 Val Thr Val Gln Asp Ala Leu Lys His Pro Asn Trp Ser Met Gly Gln 195 200 205 Lys Ile Thr Ile Asp Ser Ala Thr Leu Met Asn Lys Gly Leu Glu Val 210 215 220 Ile Glu Ala His Tyr Leu Phe Gly Leu Asp Tyr Asp His Ile Asp Ile 225 230 235 240 Val Ile His Pro Gln Ser Ile Ile His Ser Leu Ile Glu Val Gln Asp 245 250 255 Thr Ser Val Leu Ala Gln Leu Gly Trp Pro Asp Met Arg Leu Pro Leu 260 265 270 Leu Tyr Ala Leu Ser Trp Pro Glu Arg Ile Tyr Thr Asp Trp Glu Pro 275 280 285 Leu Asp Leu Val Lys Ala Gly Ser Leu Ser Phe Arg Glu Pro Asp His 290 295 300 Asp Lys Tyr Pro Cys Met Gln Leu Ala Tyr Gly Ala Gly Arg Ala Gly 305 310 315 320 Gly Ala Met Pro Ala 
Val Leu Asn Ala Ala Asn Glu Gln Ala Val Ala 325 330 335 Leu Phe Leu Gln Glu Lys Ile Ser Phe Leu Asp Ile Pro Arg Leu Ile 340 345 350 Glu Lys Thr Cys Asp Leu Tyr Val Gly Gln Asn Thr Ala Ser Pro Asp 355 360 365 Leu Glu Thr Ile Leu Ala Ala Asp Gln Trp Ala Arg Arg Thr Val Leu 370 375 380 Glu Asn Ser Ala Cys Val Ala Thr Arg Pro 385 390 104 436 PRT Mycobacterium tuberculosis 104 Met Ala Thr Gly Gly Arg Val Val Ile Arg Arg Arg Gly Asp Asn Glu 1 5 10 15 Val Val Ala His Asn Asp Glu Val Thr Asn Ser Thr Asp Gly Arg Ala 20 25 30 Asp Gly Arg Leu Arg Val Val Val Leu Gly Ser Thr Gly Ser Ile Gly 35 40 45 Thr Gln Ala Leu Gln Val Ile Ala Asp Asn Pro Asp Arg Phe Glu Val 50 55 60 Val Gly Leu Ala Ala Gly Gly Ala His Leu Asp Thr Leu Leu Arg Gln 65 70 75 80 Arg Ala Gln Thr Gly Val Thr Asn Ile Ala Val Ala Asp Glu His Ala 85 90 95 Ala Gln Arg Val Gly Asp Ile Pro Tyr His Gly Ser Asp Ala Ala Thr 100 105 110 Arg Leu Val Glu Gln Thr Glu Ala Asp Val Val Leu Asn Ala Leu Val 115 120 125 Gly Ala Leu Gly Leu Arg Pro Thr Leu Ala Ala Leu Lys Thr Gly Ala 130 135 140 Arg Leu Ala Leu Ala Asn Lys Glu Ser Leu Val Ala Gly Gly Ser Leu 145 150 155 160 Val Leu Arg Ala Ala Arg Pro Gly Gln Ile Val Pro Val Asp Ser Glu 165 170 175 His Ser Ala Leu Ala Gln Cys Leu Arg Gly Gly Thr Pro Asp Glu Val 180 185 190 Ala Lys Leu Val Leu Thr Ala Ser Gly Gly Pro Phe Arg Gly Trp Ser 195 200 205 Ala Ala Asp Leu Glu His Val Thr Pro Glu Gln Ala Gly Ala His Pro 210 215 220 Thr Trp Ser Met Gly Pro Met Asn Thr Leu Asn Ser Ala Ser Leu Val 225 230 235 240 Asn Lys Gly Leu Glu Val Ile Glu Thr His Leu Leu Phe Gly Ile Pro 245 250 255 Tyr Asp Arg Ile Asp Val Val Val His Pro Gln Ser Ile Ile His Ser 260 265 270 Met Val Thr Phe Ile Asp Gly Ser Thr Ile Ala Gln Ala Ser Pro Pro 275 280 285 Asp Met Lys Leu Pro Ile Ser Leu Ala Leu Gly Trp Pro Arg Arg Val 290 295 300 Ser Gly Ala Ala Ala Ala Cys Asp Phe His Thr Ala Ser Ser Trp Glu 305 310 315 320 Phe Glu Pro Leu Asp Thr Asp Val Phe Pro Ala Val Glu Leu Ala Arg 325 330 335 Gln Ala Gly Val Ala Gly Gly 
Cys Met Thr Ala Val Tyr Asn Ala Ala 340 345 350 Asn Glu Glu Ala Ala Ala Ala Phe Leu Ala Gly Arg Ile Gly Phe Pro 355 360 365 Ala Ile Val Gly Ile Ile Ala Asp Val Leu His Ala Ala Asp Gln Trp 370 375 380 Ala Val Glu Pro Ala Thr Val Asp Asp Val Leu Asp Ala Gln Arg Trp 385 390 395 400 Ala Arg Glu Arg Ala Gln Arg Ala Val Ser Gly Met Ala Ser Val Ala 405 410 415 Ile Ala Ser Thr Ala Lys Pro Gly Ala Ala Gly Arg His Ala Ser Thr 420 425 430 Leu Glu Arg Ser 435 105 1191 DNA Pseudomonas aeruginosa 105 atgagtcgac cgcagcggat cagcgtgctc ggcgcgaccg gctcgatcgg cctgagcacc 60 ctggacgtcg tccagcgtca tcccgatcgt tacgaagcct tcgccctgac tggcttcagc 120 cgcctggccg aactcgaggc gctgtgcctc aggcaccgcc ccgtctatgc ggtggtgccg 180 gagcaggccg cggcgattgc cttgcagggc tcgctcgccg cggcgggtat ccgcacccgg 240 gtgctgttcg gcgagcaggc gttgtgcgaa gtggccagcg cgcccgaagt ggacatggta 300 atggcggcca tcgtcggcgc cgccgggctg ccgtcgaccc tggcggccgt cgaggccggc 360 aagcgcgtac tgctggccaa caaggaggcg ctggtgatgt ccggcgcgct gttcatgcag 420 gcggtcaagc gcagcggcgc ggtgctcctg ccgatcgaca gcgagcacaa cgcgatcttc 480 cagtcgctgc cgcgcaatta tgccgatggc ctggagcggg tcggcgtgcg ccggatcctc 540 ttgaccgcct ccggcggccc gttccgcgag acgccgctgg agcaactcgc ttcggtgacg 600 ccggagcagg cttgtgcgca cccgaactgg tcgatggggc gtaagatttc cgtcgactcc 660 gccagcatga tgaacaaggg gctcgaactg atcgaggcgt gctggctgtt cgacgcccag 720 ccgagccagg tcgaggtggt gatccacccg cagagcgtga tccactcgat ggtggactac 780 gtcgacggtt cggtgatcgc ccagctcggc aatccggaca tgcgcacgcc gatttcctat 840 gccatggcct ggccggagcg aatcgattcc ggcgtttcgc cgctggatat gttcgccgtc 900 ggtcgcctgg atttccagcg ccccgacgag cagcgcttcc cctgcctgcg cctggcgagc 960 caggccgcgg aaaccggcgg cagcgccccg gccatgctga atgccgcgaa cgaggtggcc 1020 gtggccgcat ttctcgagcg gcacatccgc ttcagcgaca tcgcggttat catcgaggac 1080 gtgctgaacc gcgaggcggt gaccgcagtc gaatcgctcg atcaggtcct ggctgccgat 1140 cgccgcgcgc gttcggtcgc cgggcaatgg ttgacccggc acgccggcta g 1191 106 1167 DNA Zymomonas mobilis 106 atgagtcagc caagaacagt cactgtttta ggggcgaccg gatccattgg tcattcaaca 60 
ctggatttaa tcgaacggaa tttagatcgg tatcaggtca tcgctttgac cgccaaccgc 120 aatgtcaaag atctggccga tgcggcgaaa agaacgaatg ccaagcgggc ggttatcgct 180 gacccgtcgc tttataatga tctgaaagag gctttggccg gaagctctgt tgaggcagcc 240 gcgggtgctg atgccttggt cgaagccgcc atgatgggtg ccgattggac aatggcagcc 300 attatcggtt gcgccggtct aaaagcgacg cttgcagcta ttcgcaaggg caaaacggtc 360 gctttagcga ataaggaatc cttagtttca gctggcggat tgatgatcga tgccgtgcgg 420 gaacatggca cgacgcttct ccccgtcgat tccgagcata acgctatttt ccaatgcttc 480 ccgcatcata accgcgacta tgttcgccgg attattatta cggccagcgg aggtcccttc 540 agaacaacgt ctcttgccga aatggcaacg gtcacgccag aacgcgcggt tcagcatccc 600 aactggtcaa tgggtgccaa gatttctatc gattctgcta caatgatgaa taaggggctt 660 gaattgatag aagcctatca tctcttccag attccattag aaaaatttga aattttggtt 720 catcctcagt cagttattca ctccatggtg gaatatttgg atggttctat ccttgcccag 780 atcggtagtc ctgatatgag aacaccgatc ggtcatactt tggcttggcc aaagcggatg 840 gaaacaccag ccgaatcgtt ggattttacc aaattgcgcc agatggattt tgaagcacca 900 gattatgaac gttttccggc attaactttg gcaatggaat ccatcaaatc aggtggggct 960 cgtcctgctg taatgaatgc cgctaatgaa atagctgtgg cggccttcct tgataagaaa 1020 atcggttttc ttgatatcgc taaaattgtc gagaaaacat tagatcatta tacacccgca 1080 accccgtctt ctttggaaga tgtctttgcg atcgacaatg aagcgcggat acaagccgct 1140 gctttaatgg agagtttgcc cgcgtga 1167 107 1161 DNA Streptomyces griseolosporeus 107 ttggtcattc tcggctcgac cggctcgatc ggcacccagg ccatcgacgt ggtgctccgc 60 aaccccggcc ggttcaaggt ggtcgcgctg tccgcggccg gcggcgcggt ggagctgctc 120 gccgagcagg ccgtcgcact gggcgtgcac accgtcgcgg tggccgaccc ggccgccgag 180 gaagccgctg cgcgaggccc tggcggccaa ggcgcagggc gcccgctgcc gcgggtgctg 240 gcgggcccgg acgcggcgac cgagctggcc gcggcggagt gccactcggt gctgaacggc 300 atcaccggtt cgatcggcct ggccccgacg ctggccgcgc tgcgggccgg ccgggtgctg 360 gtgctggcga acaaggagtc gctgatcgtc ggcggtccgc tggtgaaggc ggtggcgcag 420 cccggccaga tcgtgccggt ggactccgag cacgccgcgc tgttccaggc gctggccggc 480 ggcgcccgcg cggaggtccg caagctggtg gtgaccgcca gcggcggccc gttccgcaac 540 cgcacccgtg agcagctggc 
ggccgtcacg ccggccgacg cgctggcgca cccgacctgg 600 gcgatgggcc cggtggtgac gatcaactcg gcgaccctgg tgaacaaggg cctggaggtg 660 atcgaggcgc acctgctgta cgacgtgccg ttcgaccgga tcgaggtggt ggtccatccg 720 cagtcggtcg ttcattcgat ggtggaattc gtggacggtt cgacgatggc ccaggccagc 780 ccgccggaca tgcgcatgcc gatcgcgctg ggcctcggct ggccggaccg ggtgccggac 840 gccgcccccg gctgcgactg gaccaaggcc gcgacctggg agttcttccc gctggacaac 900 gaggcgttcc cggcggtcga gctggcccgc gaggtgggta cgctcggcgg gaccgccccg 960 gcggtcttca atgccgccaa cgaggaatgt gtggacgctt tcctgaaggg cgcactgccc 1020 ttcaccggaa tcgtggacac tgtggcgaag gtggtcgccg aacacggcac accgcaatcg 1080 ggaacttcgc tcacggtgga ggacgtactc cacgcggaga gctgggcccg ggcccgggcc 1140 cgcgagctgg cggccggctg a 1161 108 1185 DNA Neisseria meningitidis 108 atgacaccac aagtcctgac catattaggc agtaccggca gcataggcga aagcacgctg 60 gacgttgtct cccgccaccc cgaaaaattc cgcgtattcg cgctggcagg gcataagcag 120 gtcgagaaat tggcggctca atgtcaaacg ttccaccccg aatatgccgt cgttgccgat 180 gccgaacacg ccgcccggct tgaagccctg ttgaaacgcg acggcacggc gactcaggtt 240 ttacacggcg cgcaggcatt ggttgacgtt gcctctgccg acgaagtcag cggtgtcatg 300 tgcgccatcg tcggggcggt ggggctgcct tccgcgctcg cagcggcgca aaaaggcaaa 360 accatttatc tggcgaacaa agagacgctg gtggtttccg gcgcgttgtt tatggaaacc 420 gcccgtgcaa acggcgcggc agtgctgccc gtcgacagcg aacacaacgc cgttttccaa 480 gttttgccgc gcgattacac aggtcgcctg aacgaacacg gcatcgcttc gattatcctg 540 accgcttccg gcggcccgtt tctgaccgcc gatttaaaca cgttcgacag cattacgccc 600 gaccaagcgg tcaaacaccc caattggcgt atgggacgca aaatctccgt cgattccgcc 660 accatgatga acaaaggttt ggagctgatt gaagcgcatt ggctgttcaa ctgtccgccc 720 gacaaactcg aagtcgtcat ccatccgcaa tctgtgatac acagcatggt gcgctaccgc 780 gacggctccg tgttggcgca actgggcaat cccgatatgc gaacgcctat cgcttattgt 840 ttgggtttgc ccgagcgcat cgattcgggt gtcggcgacc tggatttcga cgcattgtcc 900 gcgctgacct tccaaaagcc cgactttgac cgcttcccct gcctgaagct cgcctatgaa 960 gccatgaacg caggcggagc cgcgccctgc gtattgaacg ccgccaacga agccgccgtc 1020 gccgcctttt tggacggaca gattaagttt accgacattg ccaaaaccgt 
cgcccattgt 1080 ctttcacaag acttttcaga cggcataggc gacatagggg ggctcttggc gcaagatgcc 1140 cggacacgcg cacaagcgcg ggcatttatc ggcacactgc gctga 1185 109 1197 DNA Escherichia coli 109 atgaagcaac tcaccattct gggctcgacc ggctcgattg gttgcagcac gctggacgtg 60 gtgcgccata atcccgaaca cttccgcgta gttgcgctgg tggcaggcaa aaatgtcact 120 cgcatggtag aacagtgcct ggaattctct ccccgctatg ccgtaatgga cgatgaagcg 180 agtgcgaaac ttcttaaaac gatgctacag caacagggta gccgcaccga agtcttaagt 240 gggcaacaag ccgcttgcga tatggcagcg cttgaggatg ttgatcaggt gatggcagcc 300 attgttggcg ctgctgggct gttacctacg cttgctgcga tccgcgcggg taaaaccatt 360 ttgctggcca ataaagaatc actggttacc tgcggacgtc tgtttatgga cgccgtaaag 420 cagagcaaag cgcaattgtt accggtcgat agcgaacata acgccatttt tcagagttta 480 ccgcaaccta tccagcataa tctgggatac gctgaccttg agcaaaatgg cgtggtgtcc 540 attttactta ccgggtctgg tggccctttc cgtgagacgc cattgcgcga tttggcaaca 600 atgacgccgg atcaagcctg ccgtcatccg aactggtcga tggggcgtaa aatttctgtc 660 gattcggcta ccatgatgaa caaaggtctg gaatacattg aagcgcgttg gctgtttaac 720 gccagcgcca gccagatgga agtgctgatt cacccgcagt cagtgattca ctcaatggtg 780 cgctatcagg acggcagtgt tctggcgcag ctgggggaac cggatatgcg tacgccaatt 840 gcccacacca tggcatggcc gaatcgcgtg aactctggcg tgaagccgct cgatttttgc 900 aaactaagtg cgttgacatt tgccgcaccg gattatgatc gttatccatg cctgaaactg 960 gcgatggagg cgttcgaaca aggccaggca gcgacgacag cattgaatgc cgcaaacgaa 1020 atcaccgttg ctgcttttct tgcgcaacaa atccgcttta cggatatcgc tgcgttgaat 1080 ttatccgtac tggaaaaaat ggatatgcgc gaaccacaat gtgtggacga tgtgttatct 1140 gttgatgcga acgcgcgtga agtcgccaga aaagaggtga tgcgtctcgc aagctga 1197 110 1209 DNA Synechococcus leopoliensis 110 gtgaaagcag tgacactgct cggttcaacc ggctcgatcg ggacacaaac cctagacatt 60 cttgagcagt atcccgatcg ctttcgcctc gtagggctgg cggctggtcg taatgtggcg 120 ctgttgtcgg agcaaattcg gcggcaccga ccagagattg tggcgattca agatgcagct 180 cagctgtcgg aactgcaagc ggcgatcgca gaccttgata atccgccgct catcctgacc 240 ggtgaggcag gtgtcacgga agtggctcgc tacggtgatg ccgagattgt ggtcactggc 300 attgtcggtt gcgctggtct 
gctacccacg atcgccgcga tcgaagccgg caaggatatc 360 gcccttgcca acaaagaaac cctgattgca gcaggcccag tggtcctgcc actcctgcaa 420 aagcacggtg tcaccattac gcctgccgac tccgagcact ccgcgatctt tcagtgcatc 480 caagggcttt caacccatgc tgattttcgg cctgctcaag tcgtggcagg gctgcgacgg 540 attctcctga ctgccagtgg cggcgctttt cgggactggc cggtcgaacg gctgtcgcaa 600 gtaactgtcg cagatgcgct caagcatccc aactggtcga tggggcgcaa gattaccgtc 660 gactccgcca ccttgatgaa taaaggcctc gaggtgatcg aagcccacta tctcttcggc 720 ttggattacg actacatcga catcgtcatc catccccaga gcatcatcca ctcgctgatt 780 gagctagaag atacctccgt cttggcgcaa ttgggctggc cggatatgcg actgcccttg 840 ctctacgccc tctcctggcc cgatcgcctc tctactcaat ggtcggcgct cgatctggtc 900 aaagcgggca gcttggagtt ccgggaaccg gatcacgcca aatacccctg catggacttg 960 gcctacgccg ccggtcgcaa aggcggcaca atgccagccg tcttgaatgc ggcgaatgag 1020 caagccgtcg ccctcttcct agaggagcaa attcacttct cggatattcc gcgcctgatt 1080 gaacgtgcct gcgatcgcca ccaaacggag tggcaacagc aaccgagctt ggatgacatt 1140 ttggcctacg acgcttgggc acggcagttt gtgcaagcta gctatcaaag tctggaatcc 1200 gtcgtttag 1209 111 1221 DNA Mycobacterium leprae 111 gtgaacaatc cgatcgaggg gcacgctggc ggccgcctcc gcgtgctggt gttgggaagt 60 actggctcaa ttggcaccca ggcgctggaa gttatcgccg ccaatccgga ccgtttcgag 120 gtagtcgggc tggccgccgg gggcgcgcag ctggacacgc tgctgaggca gcgcgccgcg 180 accggcgtca ccaatatcgc catcgctgac gatcgcgcgg ctcagctggc cggcgacatc 240 ccttaccacg ggaccgatgc ggtcacccgg ctggttgagg agactgaggc tgacgttgtc 300 ctcaatgcgc tggtcggggc attgggtctg cgacccacac tggctgcact gcacacgggc 360 gcgcgattgg cgttggccaa caaggaatcg ctggtagctg gcggttcgct ggtgttggcc 420 gcggcgcagc caggccagat cgtgcccgta gactcggaac actccgcgct ggcgcaatgc 480 ctgcgcggtg gtacccccga cgaagttgct aagttagtgc taaccgcctc cggcgggccg 540 tttcgtggct ggaacgccgg cgacttggag cgcgttacac ccgagcaggc gggcgtccat 600 ccgacttggt caatggggac gatgaacacg ctgaactcag cgtctctggt taacaagggg 660 ctcgagctca tcgaagccaa cctgttgttc ggcattccct acgaccgcat tgaggtggtt 720 gtgcaccctc agtcaattgt tcattcgatg gtgacattca tcgacggctc gacgatcgcc 780 
caagccagcc ctccggacat gaagctacct atttctttgg cgttgggctg gccacagcgg 840 gtgggtggcg ctgctcgagc ctgtgctttc actaccgcat ctacctggga attcgagccg 900 ctggacatcg atgtttttcc cgcagtcgag ctggcccggc acgctggaca gatcggcggc 960 tgtatgaccg ccatttacga tgctgctaat gaggaggctg cagaggcctt cctccaaggt 1020 cggatcggct tccccgccat cgtcgcaaca atcgcggatg tgttgcagcg tgccgaccaa 1080 tgggctcccc aatggggtga gggacccgct actgtggatg atgtactcga cgcgcagcgc 1140 tgggcccgtg agcgagcgtt gtgtgcggta gcaacagcga gttctggaaa ggtctctgac 1200 atggtcttag aaaggtccta a 1221 112 1218 DNA Pasteurella multocida 112 atgagtatta gttattttat gaaaaagatc gttattttag gttcaactgg atcgattggt 60 accagtactt tatccgtgat tacacataat cctgataagt accaagtgtt tgcgttagtt 120 ggtggacgta atgtagagct aatgtttcaa caatgtttga cattccaacc gtcgtttgct 180 gcgttagatg acgatgtcgc agccaaaatg ttggcagaga aactgaaagc ccaccaaagc 240 caaacaacag tcttagcagg acagcaagcc atttgtgagt tagcggcaca tcctgaagca 300 gatatggtaa tggctgcgat tgtgggggcg gcgggattat tgcctacttt gtctgcggtg 360 aaagctggaa aacgtgtact attagcaaat aaagaagcct tggtaacttg cgggcaatta 420 tttattgatg cagtgcgtga atctcaagca caattgttac cagtagatag tgaacataat 480 gcgattttcc aatcccttcc gcctgaagcg caaagacaaa ttgggttttg cccgctttct 540 gaattaggga tcagtaagat tgtgttaacg ggatccggtg gtccattccg ttatacccct 600 ctggagcaat ttgaacagat caccccagca caagcagttg cgcatcctaa ttggtcaatg 660 gggaaaaaga tctctgtcga ttccgctacc atgatgaata aagggttgga atatattgaa 720 gcacgctggt tatttaatgc ctcggcagaa gaaatggaag ttattattca tcctcaatcc 780 attattcatt ctatggtacg ttacatcgat gggtccgtga ttgctcaaat ggggaatcct 840 gatatgcgta caccgattgc ggaaaccatg gcatatccaa gtcggaccgt tgctggcgtt 900 gagcccttgg atttttacca actgaatgga ttaaccttta ttgagccaga ctatcaacgt 960 tatccatcca tggatcttgc ttatgctgct ggacgagctg gaggcacaac cacgacagca 1020 atgaatgcag cgaatgaaat cgcggtagcg tctttcttag acaataagat taaattcaca 1080 gatattgcgc gactaaatca gttagtcgtg agcaaattgc aaccacaaaa aattcattgc 1140 atagaagatg tacttgaggt agataaaaag gcaagggaat tatctcagtc aatcatttta 1200 agtttttcac atccgtaa 1218 113 1434 
DNA Arabidopsis thaliana 113 atgatgacat taaactcact atctccagct gaatccaaag ctatttcttt cttggatacc 60 tccaggttca atccaatccc taaactctca ggtgggttta gtttgaggag gaggaatcaa 120 gggagaggtt ttggaaaagg tgttaagtgt tcagtgaaag tgcagcagca acaacaacct 180 cctccagcat ggcctgggag agctgtccct gaggcgcctc gtcaatcttg ggatggacca 240 aaacccatct ctatcgttgg atctactggt tctattggca ctcagacatt ggatattgtg 300 gctgagaatc ctgacaaatt cagagttgtg gctctagctg ctggttcgaa tgttactcta 360 cttgctgatc aggtaaggag atttaagcct gcattggttg ctgttagaaa cgagtcactg 420 attaatgagc ttaaagaggc tttagctgat ttggactata aactcgagat tattccagga 480 gagcaaggag tgattgaggt tgcccgacat cctgaagctg taaccgttgt taccggaata 540 gtaggttgtg cgggactaaa gcctacggtt gctgcaattg aagcaggaaa ggacattgct 600 cttgcaaaca aagagacatt aatcgcaggt ggtcctttcg tgcttccgct tgccaacaaa 660 cataatgtaa agattcttcc ggcagattca gaacattctg ccatatttca gtgtattcaa 720 ggtttgcctg aaggcgctct gcgcaagata atcttgactg catctggtgg agcttttagg 780 gattggcctg tcgaaaagct aaaggaagtt aaagtagcgg atgcgttgaa gcatccaaac 840 tggaacatgg gaaagaaaat cactgtggac tctgctacgc ttttcaacaa gggtcttgag 900 gtcattgaag cgcattattt gtttggagct gagtatgacg atatagagat tgtcattcat 960 ccgcaaagta tcatacattc catgattgaa acacaggatt catctgtgct tgctcaattg 1020 ggttggcctg atatgcgttt accgattctc tacaccatgt catggcccga tagagttcct 1080 tgttctgaag taacttggcc aagacttgac ctttgcaaac tcggttcatt gactttcaag 1140 aaaccagaca atgtgaaata cccatccatg gatcttgctt atgctgctgg acgagctgga 1200 ggcacaatga ctggagttct cagcgccgcc aatgagaaag ctgttgaaat gttcattgat 1260 gaaaagataa gctatttgga tatcttcaag gttgtggaat taacatgcga taaacatcga 1320 aacgagttgg taacatcacc gtctcttgaa gagattgttc actatgactt gtgggcacgt 1380 gaatatgccg cgaatgtgca gctttcttct ggtgctaggc cagttcatgc atga 1434 114 1071 DNA Campylobacter jejuni 114 atgatacttt ttggaagtac gggcagtata ggagtaaatg ctcttaaact tgctgcttta 60 aaaaacattc ccatttctgc tttagcttgt ggggataaca tcgctctttt aaatgagcaa 120 atcgcaaggt ttaaacccaa atttgtttcc ataaaagatt caaaaaataa gcatttagtt 180 aaacacgata gagtttttat agggcaagaa 
ggtttagagc aaattttaac agaatgtcaa 240 gataagcttt tactcaatgc cattgtaggt tttgcaggac ttaaaagcac tttaaaggct 300 aaagagcttg gcaaaaacat agctttagct aacaaagaaa gtcttgtagt agctgggagt 360 tttttgaaag gggctaaatt tttacccgtt gatagtgagc atgcagcttt aaaattttta 420 ctcgaaggta aaaaaaatat agcaaaactt tatatcacag caagtggtgg agctttttat 480 aggtataaaa tcaaagattt aaatcaagtc agtgtcaaag atgctttaaa acatcctaat 540 tggaacatgg gagcaaagat cactatagat agtgcgacta tggcaaataa gctttttgag 600 attatagagg cttatcattt atatgatttt aaagaaattg atgctttaat agaaccaaga 660 tctttagtgc atgcaatgtg tgaatttaaa aatggagcta gcacggcgta tttttcaaaa 720 gcagatatga aactagctat ttcagatgct atatttgaaa aacaagatac gcctatttta 780 gaggctgttg attttagcaa aatgcctgct ttaaaatttc atccaatcag cacaaaaaaa 840 tatcctattt ttaagcttaa aaatacattt ttaaaagagc caaatttagg tgttatcatc 900 aatgctgcta atgaagttgg tgtttataat tttttagaaa ataaaagtgg atttttagac 960 attgctaaat gcatttttaa agcccttgat cattttggag tacctaaaat ttcaagcata 1020 gaagaagttt ttgagtatga ttttaaaaca agagagtatt taaggagtta a 1071 115 1467 DNA Plasmodium falciparum 115 atgaagaaat atatttatat atattttttc ttcatcacaa taactattaa tgatttagta 60 ataaataata catcaaaatg tgtttccatt gaaagaagaa aaaataacgc atatataaat 120 tatggtatag gatataatgg accagataat aaaataacaa agagtagaag atgtaaaaga 180 ataaagttat gcaaaaagga tttaatagat attggtgcaa taaagaaacc aattaatgta 240 gcaatttttg gaagtactgg tagtataggt acgaatgctt taaatataat aagggagtgt 300 aataaaattg aaaatgtttt taatgttaaa gcattgtatg tgaataagag tgtgaatgaa 360 ttatatgaac aagctagaga atttttacca gaatatttgt gtatacatga taaaagtgta 420 tatgaagaat taaaagaact ggtaaaaaat ataaaagatt ataaacctat aatattgtgt 480 ggtgatgaag ggatgaaaga aatatgtagt agtaatagta tagataaaat agttattggt 540 attgattctt ttcaaggatt atattctact atgtatgcaa ttatgaataa taaaatagtt 600 gcgttagcta ataaagaatc cattgtctct gctggtttct ttttaaagaa attattaaat 660 attcataaaa atgcaaagat aatacctgtt gattcagaac atagtgctat atttcaatgt 720 ttagataata ataaggtatt aaaaacaaaa tgtttacaag acaatttttc taaaattaac 780 aatataaata aaatattttt atgttcatct 
ggaggtccat ttcaaaattt aactatggac 840 gaattaaaaa atgtaacatc agaaaatgct ttaaagcatc ctaaatggaa aatgggtaag 900 aaaataacta tagattctgc aactatgatg aataaaggtt tagaggttat agaaacccat 960 tttttatttg atgtagatta taatgatata gaagttatag tacataaaga atgcattata 1020 cattcttgtg ttgaatttat agacaaatca gtaataagtc aaatgtatta tccagatatg 1080 caaataccca tattatattc tttaacatgg cctgatagaa taaaaacaaa tttaaaacct 1140 ttagatttgg ctcaggtttc aactcttaca tttcataaac cttctttaga acatttcccg 1200 tgtattaaat tagcttatca agcaggtata aaaggaaact tttatccaac tgtactaaat 1260 gcgtcaaatg aaatagctaa caacttattt ttgaataata aaattaaata ttttgatatt 1320 tcctctataa tatcgcaagt tcttgaatct ttcaattctc aaaaggtttc ggaaaatagt 1380 gaagatttaa tgaagcaaat tctacaaata cattcttggg ccaaagataa agctaccgat 1440 atatacaaca aacataattc ttcatag 1467 116 388 PRT Zymomonas mobilis 116 Met Ser Gln Pro Arg Thr Val Thr Val Leu Gly Ala Thr Gly Ser Ile 1 5 10 15 Gly His Ser Thr Leu Asp Leu Ile Glu Arg Asn Leu Asp Arg Tyr Gln 20 25 30 Val Ile Ala Leu Thr Ala Asn Arg Asn Val Lys Asp Leu Ala Asp Ala 35 40 45 Ala Lys Arg Thr Asn Ala Lys Arg Ala Val Ile Ala Asp Pro Ser Leu 50 55 60 Tyr Asn Asp Leu Lys Glu Ala Leu Ala Gly Ser Ser Val Glu Ala Ala 65 70 75 80 Ala Gly Ala Asp Ala Leu Val Glu Ala Ala Met Met Gly Ala Asp Trp 85 90 95 Thr Met Ala Ala Ile Ile Gly Cys Ala Gly Leu Lys Ala Thr Leu Ala 100 105 110 Ala Ile Arg Lys Gly Lys Thr Val Ala Leu Ala Asn Lys Glu Ser Leu 115 120 125 Val Ser Ala Gly Gly Leu Met Ile Asp Ala Val Arg Glu His Gly Thr 130 135 140 Thr Leu Leu Pro Val Asp Ser Glu His Asn Ala Ile Phe Gln Cys Phe 145 150 155 160 Pro His His Asn Arg Asp Tyr Val Arg Arg Ile Ile Ile Thr Ala Ser 165 170 175 Gly Gly Pro Phe Arg Thr Thr Ser Leu Ala Glu Met Ala Thr Val Thr 180 185 190 Pro Glu Arg Ala Val Gln His Pro Asn Trp Ser Met Gly Ala Lys Ile 195 200 205 Ser Ile Asp Ser Ala Thr Met Met Asn Lys Gly Leu Glu Leu Ile Glu 210 215 220 Ala Tyr His Leu Phe Gln Ile Pro Leu Glu Lys Phe Glu Ile Leu Val 225 230 235 240 His Pro Gln Ser Val Ile His Ser Met Val Glu Tyr Leu 
Asp Gly Ser 245 250 255 Ile Leu Ala Gln Ile Gly Ser Pro Asp Met Arg Thr Pro Ile Gly His 260 265 270 Thr Leu Ala Trp Pro Lys Arg Met Glu Thr Pro Ala Glu Ser Leu Asp 275 280 285 Phe Thr Lys Leu Arg Gln Met Asp Phe Glu Ala Pro Asp Tyr Glu Arg 290 295 300 Phe Pro Ala Leu Thr Leu Ala Met Glu Ser Ile Lys Ser Gly Gly Ala 305 310 315 320 Arg Pro Ala Val Met Asn Ala Ala Asn Glu Ile Ala Val Ala Ala Phe 325 330 335 Leu Asp Lys Lys Ile Gly Phe Leu Asp Ile Ala Lys Ile Val Glu Lys 340 345 350 Thr Leu Asp His Tyr Thr Pro Ala Thr Pro Ser Ser Leu Glu Asp Val 355 360 365 Phe Ala Ile Asp Asn Glu Ala Arg Ile Gln Ala Ala Ala Leu Met Glu 370 375 380 Ser Leu Pro Ala 385 117 396 PRT Pseudomonas aeruginosa 117 Met Ser Arg Pro Gln Arg Ile Ser Val Leu Gly Ala Thr Gly Ser Ile 1 5 10 15 Gly Leu Ser Thr Leu Asp Val Val Gln Arg His Pro Asp Arg Tyr Glu 20 25 30 Ala Phe Ala Leu Thr Gly Phe Ser Arg Leu Ala Glu Leu Glu Ala Leu 35 40 45 Cys Leu Arg His Arg Pro Val Tyr Ala Val Val Pro Glu Gln Ala Ala 50 55 60 Ala Ile Ala Leu Gln Gly Ser Leu Ala Ala Ala Gly Ile Arg Thr Arg 65 70 75 80 Val Leu Phe Gly Glu Gln Ala Leu Cys Glu Val Ala Ser Ala Pro Glu 85 90 95 Val Asp Met Val Met Ala Ala Ile Val Gly Ala Ala Gly Leu Pro Ser 100 105 110 Thr Leu Ala Ala Val Glu Ala Gly Lys Arg Val Leu Leu Ala Asn Lys 115 120 125 Glu Ala Leu Val Met Ser Gly Ala Leu Phe Met Gln Ala Val Lys Arg 130 135 140 Ser Gly Ala Val Leu Leu Pro Ile Asp Ser Glu His Asn Ala Ile Phe 145 150 155 160 Gln Ser Leu Pro Arg Asn Tyr Ala Asp Gly Leu Glu Arg Val Gly Val 165 170 175 Arg Arg Ile Leu Leu Thr Ala Ser Gly Gly Pro Phe Arg Glu Thr Pro 180 185 190 Leu Glu Gln Leu Ala Ser Val Thr Pro Glu Gln Ala Cys Ala His Pro 195 200 205 Asn Trp Ser Met Gly Arg Lys Ile Ser Val Asp Ser Ala Ser Met Met 210 215 220 Asn Lys Gly Leu Glu Leu Ile Glu Ala Cys Trp Leu Phe Asp Ala Gln 225 230 235 240 Pro Ser Gln Val Glu Val Val Ile His Pro Gln Ser Val Ile His Ser 245 250 255 Met Val Asp Tyr Val Asp Gly Ser Val Ile Ala Gln Leu Gly Asn Pro 260 265 270 Asp Met Arg Thr 
Pro Ile Ser Tyr Ala Met Ala Trp Pro Glu Arg Ile 275 280 285 Asp Ser Gly Val Ser Pro Leu Asp Met Phe Ala Val Gly Arg Leu Asp 290 295 300 Phe Gln Arg Pro Asp Glu Gln Arg Phe Pro Cys Leu Arg Leu Ala Ser 305 310 315 320 Gln Ala Ala Glu Thr Gly Gly Ser Ala Pro Ala Met Leu Asn Ala Ala 325 330 335 Asn Glu Val Ala Val Ala Ala Phe Leu Glu Arg His Ile Arg Phe Ser 340 345 350 Asp Ile Ala Val Ile Ile Glu Asp Val Leu Asn Arg Glu Ala Val Thr 355 360 365 Ala Val Glu Ser Leu Asp Gln Val Leu Ala Ala Asp Arg Arg Ala Arg 370 375 380 Ser Val Ala Gly Gln Trp Leu Thr Arg His Ala Gly 385 390 395 118 398 PRT Escherichia coli 118 Met Lys Gln Leu Thr Ile Leu Gly Ser Thr Gly Ser Ile Gly Cys Ser 1 5 10 15 Thr Leu Asp Val Val Arg His Asn Pro Glu His Phe Arg Val Val Ala 20 25 30 Leu Val Ala Gly Lys Asn Val Thr Arg Met Val Glu Gln Cys Leu Glu 35 40 45 Phe Ser Pro Arg Tyr Ala Val Met Asp Asp Glu Ala Ser Ala Lys Leu 50 55 60 Leu Lys Thr Met Leu Gln Gln Gln Gly Ser Arg Thr Glu Val Leu Ser 65 70 75 80 Gly Gln Gln Ala Ala Cys Asp Met Ala Ala Leu Glu Asp Val Asp Gln 85 90 95 Val Met Ala Ala Ile Val Gly Ala Ala Gly Leu Leu Pro Thr Leu Ala 100 105 110 Ala Ile Arg Ala Gly Lys Thr Ile Leu Leu Ala Asn Lys Glu Ser Leu 115 120 125 Val Thr Cys Gly Arg Leu Phe Met Asp Ala Val Lys Gln Ser Lys Ala 130 135 140 Gln Leu Leu Pro Val Asp Ser Glu His Asn Ala Ile Phe Gln Ser Leu 145 150 155 160 Pro Gln Pro Ile Gln His Asn Leu Gly Tyr Ala Asp Leu Glu Gln Asn 165 170 175 Gly Val Val Ser Ile Leu Leu Thr Gly Ser Gly Gly Pro Phe Arg Glu 180 185 190 Thr Pro Leu Arg Asp Leu Ala Thr Met Thr Pro Asp Gln Ala Cys Arg 195 200 205 His Pro Asn Trp Ser Met Gly Arg Lys Ile Ser Val Asp Ser Ala Thr 210 215 220 Met Met Asn Lys Gly Leu Glu Tyr Ile Glu Ala Arg Trp Leu Phe Asn 225 230 235 240 Ala Ser Ala Ser Gln Met Glu Val Leu Ile His Pro Gln Ser Val Ile 245 250 255 His Ser Met Val Arg Tyr Gln Asp Gly Ser Val Leu Ala Gln Leu Gly 260 265 270 Glu Pro Asp Met Arg Thr Pro Ile Ala His Thr Met Ala Trp Pro Asn 275 280 285 Arg Val Asn Ser Gly Val 
Lys Pro Leu Asp Phe Cys Lys Leu Ser Ala 290 295 300 Leu Thr Phe Ala Ala Pro Asp Tyr Asp Arg Tyr Pro Cys Leu Lys Leu 305 310 315 320 Ala Met Glu Ala Phe Glu Gln Gly Gln Ala Ala Thr Thr Ala Leu Asn 325 330 335 Ala Ala Asn Glu Ile Thr Val Ala Ala Phe Leu Ala Gln Gln Ile Arg 340 345 350 Phe Thr Asp Ile Ala Ala Leu Asn Leu Ser Val Leu Glu Lys Met Asp 355 360 365 Met Arg Glu Pro Gln Cys Val Asp Asp Val Leu Ser Val Asp Ala Asn 370 375 380 Ala Arg Glu Val Ala Arg Lys Glu Val Met Arg Leu Ala Ser 385 390 395 119 394 PRT Neisseria meningitidis 119 Met Thr Pro Gln Val Leu Thr Ile Leu Gly Ser Thr Gly Ser Ile Gly 1 5 10 15 Glu Ser Thr Leu Asp Val Val Ser Arg His Pro Glu Lys Phe Arg Val 20 25 30 Phe Ala Leu Ala Gly His Lys Gln Val Glu Lys Leu Ala Ala Gln Cys 35 40 45 Gln Thr Phe His Pro Glu Tyr Ala Val Val Ala Asp Ala Glu His Ala 50 55 60 Ala Arg Leu Glu Ala Leu Leu Lys Arg Asp Gly Thr Ala Thr Gln Val 65 70 75 80 Leu His Gly Ala Gln Ala Leu Val Asp Val Ala Ser Ala Asp Glu Val 85 90 95 Ser Gly Val Met Cys Ala Ile Val Gly Ala Val Gly Leu Pro Ser Ala 100 105 110 Leu Ala Ala Ala Gln Lys Gly Lys Thr Ile Tyr Leu Ala Asn Lys Glu 115 120 125 Thr Leu Val Val Ser Gly Ala Leu Phe Met Glu Thr Ala Arg Ala Asn 130 135 140 Gly Ala Ala Val Leu Pro Val Asp Ser Glu His Asn Ala Val Phe Gln 145 150 155 160 Val Leu Pro Arg Asp Tyr Thr Gly Arg Leu Asn Glu His Gly Ile Ala 165 170 175 Ser Ile Ile Leu Thr Ala Ser Gly Gly Pro Phe Leu Thr Ala Asp Leu 180 185 190 Asn Thr Phe Asp Ser Ile Thr Pro Asp Gln Ala Val Lys His Pro Asn 195 200 205 Trp Arg Met Gly Arg Lys Ile Ser Val Asp Ser Ala Thr Met Met Asn 210 215 220 Lys Gly Leu Glu Leu Ile Glu Ala His Trp Leu Phe Asn Cys Pro Pro 225 230 235 240 Asp Lys Leu Glu Val Val Ile His Pro Gln Ser Val Ile His Ser Met 245 250 255 Val Arg Tyr Arg Asp Gly Ser Val Leu Ala Gln Leu Gly Asn Pro Asp 260 265 270 Met Arg Thr Pro Ile Ala Tyr Cys Leu Gly Leu Pro Glu Arg Ile Asp 275 280 285 Ser Gly Val Gly Asp Leu Asp Phe Asp Ala Leu Ser Ala Leu Thr Phe 290 295 300 Gln Lys Pro Asp 
Phe Asp Arg Phe Pro Cys Leu Lys Leu Ala Tyr Glu 305 310 315 320 Ala Met Asn Ala Gly Gly Ala Ala Pro Cys Val Leu Asn Ala Ala Asn 325 330 335 Glu Ala Ala Val Ala Ala Phe Leu Asp Gly Gln Ile Lys Phe Thr Asp 340 345 350 Ile Ala Lys Thr Val Ala His Cys Leu Ser Gln Asp Phe Ser Asp Gly 355 360 365 Ile Gly Asp Ile Gly Gly Leu Leu Ala Gln Asp Ala Arg Thr Arg Ala 370 375 380 Gln Ala Arg Ala Phe Ile Gly Thr Leu Arg 385 390 120 397 PRT Haemophilus influenzae 120 Met Gln Lys Gln Asn Ile Val Ile Leu Gly Ser Thr Gly Ser Ile Gly 1 5 10 15 Lys Ser Thr Leu Ser Val Ile Glu Asn Asn Pro Gln Lys Tyr His Ala 20 25 30 Phe Ala Leu Val Gly Gly Lys Asn Val Glu Ala Met Phe Glu Gln Cys 35 40 45 Ile Lys Phe Arg Pro His Phe Ala Ala Leu Asp Asp Val Asn Ala Ala 50 55 60 Lys Ile Leu Arg Glu Lys Leu Ile Ala His His Ile Pro Thr Glu Val 65 70 75 80 Leu Ala Gly Arg Arg Ala Ile Cys Glu Leu Ala Ala His Pro Asp Ala 85 90 95 Asp Gln Ile Met Ala Ser Ile Val Gly Ala Ala Gly Leu Leu Pro Thr 100 105 110 Leu Ser Ala Val Lys Ala Gly Lys Arg Val Leu Leu Ala Asn Lys Glu 115 120 125 Ser Leu Val Thr Cys Gly Gln Leu Phe Ile Asp Ala Val Lys Asn Tyr 130 135 140 Gly Ser Lys Leu Leu Pro Val Asp Ser Glu His Asn Ala Ile Phe Gln 145 150 155 160 Ser Leu Pro Pro Glu Ala Gln Glu Lys Ile Gly Phe Cys Pro Leu Ser 165 170 175 Glu Leu Gly Val Ser Lys Ile Ile Leu Thr Gly Ser Gly Gly Pro Phe 180 185 190 Arg Tyr Thr Pro Leu Glu Gln Phe Thr Asn Ile Thr Pro Glu Gln Ala 195 200 205 Val Ala His Pro Asn Trp Ser Met Gly Lys Lys Ile Ser Val Asp Ser 210 215 220 Ala Thr Met Met Asn Lys Gly Leu Glu Tyr Ile Glu Ala Arg Trp Leu 225 230 235 240 Phe Asn Ala Ser Ala Glu Glu Met Glu Val Ile Ile His Pro Gln Ser 245 250 255 Ile Ile His Ser Met Val Arg Tyr Val Asp Gly Ser Val Ile Thr Gln 260 265 270 Met Gly Asn Pro Asp Met Arg Thr Pro Ile Ala Glu Thr Met Ala Tyr 275 280 285 Pro His Arg Thr Phe Ala Gly Val Glu Pro Leu Asp Phe Phe Lys Ile 290 295 300 Lys Glu Leu Thr Phe Ile Glu Pro Asp Phe Asn Arg Tyr Pro Asn Leu 305 310 315 320 Lys Leu Ala Ile Asp Ala 
Phe Ala Ala Gly Gln Tyr Ala Thr Thr Ala 325 330 335 Met Asn Ala Ala Asn Glu Ile Ala Val Gln Ala Phe Leu Asp Arg Gln 340 345 350 Ile Gly Phe Met Asp Ile Ala Lys Ile Asn Ser Lys Thr Ile Glu Arg 355 360 365 Ile Ser Pro Tyr Thr Ile Gln Asn Ile Asp Asp Val Leu Glu Ile Asp 370 375 380 Ala Gln Ala Arg Glu Ile Ala Lys Thr Leu Leu Arg Glu 385 390 395 121 394 PRT Synechocystis sp. PCC 6803 121 Met Val Lys Arg Ile Ser Ile Leu Gly Ser Thr Gly Ser Ile Gly Thr 1 5 10 15 Gln Thr Leu Asp Ile Val Thr His His Pro Asp Ala Phe Gln Val Val 20 25 30 Gly Leu Ala Ala Gly Gly Asn Val Ala Leu Leu Ala Gln Gln Val Ala 35 40 45 Glu Phe Arg Pro Glu Ile Val Ala Ile Arg Gln Ala Glu Lys Leu Glu 50 55 60 Asp Leu Lys Ala Ala Val Ala Glu Leu Thr Asp Tyr Gln Pro Met Tyr 65 70 75 80 Val Val Gly Glu Glu Gly Val Val Glu Val Ala Arg Tyr Gly Asp Ala 85 90 95 Glu Ser Val Val Thr Gly Ile Val Gly Cys Ala Gly Leu Leu Pro Thr 100 105 110 Met Ala Ala Ile Ala Ala Gly Lys Asp Ile Ala Leu Ala Asn Lys Glu 115 120 125 Thr Leu Ile Ala Gly Ala Pro Val Val Leu Pro Leu Val Glu Lys Met 130 135 140 Gly Val Lys Leu Leu Pro Ala Asp Ser Glu His Ser Ala Ile Phe Gln 145 150 155 160 Cys Leu Gln Gly Val Pro Glu Gly Gly Leu Arg Arg Ile Ile Leu Thr 165 170 175 Ala Ser Gly Gly Ala Phe Arg Asp Leu Pro Val Glu Arg Leu Pro Phe 180 185 190 Val Thr Val Gln Asp Ala Leu Lys His Pro Asn Trp Ser Met Gly Gln 195 200 205 Lys Ile Thr Ile Asp Ser Ala Thr Leu Met Asn Lys Gly Leu Glu Val 210 215 220 Ile Glu Ala His Tyr Leu Phe Gly Leu Asp Tyr Asp His Ile Asp Ile 225 230 235 240 Val Ile His Pro Gln Ser Ile Ile His Ser Leu Ile Glu Val Gln Asp 245 250 255 Thr Ser Val Leu Ala Gln Leu Gly Trp Pro Asp Met Arg Leu Pro Leu 260 265 270 Leu Tyr Ala Leu Ser Trp Pro Glu Arg Ile Tyr Thr Asp Trp Glu Pro 275 280 285 Leu Asp Leu Val Lys Ala Gly Ser Leu Ser Phe Arg Glu Pro Asp His 290 295 300 Asp Lys Tyr Pro Cys Met Gln Leu Ala Tyr Gly Ala Gly Arg Ala Gly 305 310 315 320 Gly Ala Met Pro Ala Val Leu Asn Ala Ala Asn Glu Gln Ala Val Ala 325 330 335 Leu Phe Leu Gln 
Glu Lys Ile Ser Phe Leu Asp Ile Pro Arg Leu Ile 340 345 350 Glu Lys Thr Cys Asp Leu Tyr Val Gly Gln Asn Thr Ala Ser Pro Asp 355 360 365 Leu Glu Thr Ile Leu Ala Ala Asp Gln Trp Ala Arg Arg Thr Val Leu 370 375 380 Glu Asn Ser Ala Cys Val Ala Thr Arg Pro 385 390 122 405 PRT Pasteurella multocida 122 Met Ser Ile Ser Tyr Phe Met Lys Lys Ile Val Ile Leu Gly Ser Thr 1 5 10 15 Gly Ser Ile Gly Thr Ser Thr Leu Ser Val Ile Thr His Asn Pro Asp 20 25 30 Lys Tyr Gln Val Phe Ala Leu Val Gly Gly Arg Asn Val Glu Leu Met 35 40 45 Phe Gln Gln Cys Leu Thr Phe Gln Pro Ser Phe Ala Ala Leu Asp Asp 50 55 60 Asp Val Ala Ala Lys Met Leu Ala Glu Lys Leu Lys Ala His Gln Ser 65 70 75 80 Gln Thr Thr Val Leu Ala Gly Gln Gln Ala Ile Cys Glu Leu Ala Ala 85 90 95 His Pro Glu Ala Asp Met Val Met Ala Ala Ile Val Gly Ala Ala Gly 100 105 110 Leu Leu Pro Thr Leu Ser Ala Val Lys Ala Gly Lys Arg Val Leu Leu 115 120 125 Ala Asn Lys Glu Ala Leu Val Thr Cys Gly Gln Leu Phe Ile Asp Ala 130 135 140 Val Arg Glu Ser Gln Ala Gln Leu Leu Pro Val Asp Ser Glu His Asn 145 150 155 160 Ala Ile Phe Gln Ser Leu Pro Pro Glu Ala Gln Arg Gln Ile Gly Phe 165 170 175 Cys Pro Leu Ser Glu Leu Gly Ile Ser Lys Ile Val Leu Thr Gly Ser 180 185 190 Gly Gly Pro Phe Arg Tyr Thr Pro Leu Glu Gln Phe Glu Gln Ile Thr 195 200 205 Pro Ala Gln Ala Val Ala His Pro Asn Trp Ser Met Gly Lys Lys Ile 210 215 220 Ser Val Asp Ser Ala Thr Met Met Asn Lys Gly Leu Glu Tyr Ile Glu 225 230 235 240 Ala Arg Trp Leu Phe Asn Ala Ser Ala Glu Glu Met Glu Val Ile Ile 245 250 255 His Pro Gln Ser Ile Ile His Ser Met Val Arg Tyr Ile Asp Gly Ser 260 265 270 Val Ile Ala Gln Met Gly Asn Pro Asp Met Arg Thr Pro Ile Ala Glu 275 280 285 Thr Met Ala Tyr Pro Ser Arg Thr Val Ala Gly Val Glu Pro Leu Asp 290 295 300 Phe Tyr Gln Leu Asn Gly Leu Thr Phe Ile Glu Pro Asp Tyr Gln Arg 305 310 315 320 Tyr Pro Cys Leu Lys Leu Ala Ile Asp Ala Phe Ser Ala Gly Gln Tyr 325 330 335 Ala Thr Thr Ala Met Asn Ala Ala Asn Glu Ile Ala Val Ala Ser Phe 340 345 350 Leu Asp Asn Lys Ile Lys Phe Thr 
Asp Ile Ala Arg Leu Asn Gln Leu 355 360 365 Val Val Ser Lys Leu Gln Pro Gln Lys Ile His Cys Ile Glu Asp Val 370 375 380 Leu Glu Val Asp Lys Lys Ala Arg Glu Leu Ser Gln Ser Ile Ile Leu 385 390 395 400 Ser Phe Ser His Pro 405 123 402 PRT Synechococcus leopoliensis 123 Met Lys Ala Val Thr Leu Leu Gly Ser Thr Gly Ser Ile Gly Thr Gln 1 5 10 15 Thr Leu Asp Ile Leu Glu Gln Tyr Pro Asp Arg Phe Arg Leu Val Gly 20 25 30 Leu Ala Ala Gly Arg Asn Val Ala Leu Leu Ser Glu Gln Ile Arg Arg 35 40 45 His Arg Pro Glu Ile Val Ala Ile Gln Asp Ala Ala Gln Leu Ser Glu 50 55 60 Leu Gln Ala Ala Ile Ala Asp Leu Asp Asn Pro Pro Leu Ile Leu Thr 65 70 75 80 Gly Glu Ala Gly Val Thr Glu Val Ala Arg Tyr Gly Asp Ala Glu Ile 85 90 95 Val Val Thr Gly Ile Val Gly Cys Ala Gly Leu Leu Pro Thr Ile Ala 100 105 110 Ala Ile Glu Ala Gly Lys Asp Ile Ala Leu Ala Asn Lys Glu Thr Leu 115 120 125 Ile Ala Ala Gly Pro Val Val Leu Pro Leu Leu Gln Lys His Gly Val 130 135 140 Thr Ile Thr Pro Ala Asp Ser Glu His Ser Ala Ile Phe Gln Cys Ile 145 150 155 160 Gln Gly Leu Ser Thr His Ala Asp Phe Arg Pro Ala Gln Val Val Ala 165 170 175 Gly Leu Arg Arg Ile Leu Leu Thr Ala Ser Gly Gly Ala Phe Arg Asp 180 185 190 Trp Pro Val Glu Arg Leu Ser Gln Val Thr Val Ala Asp Ala Leu Lys 195 200 205 His Pro Asn Trp Ser Met Gly Arg Lys Ile Thr Val Asp Ser Ala Thr 210 215 220 Leu Met Asn Lys Gly Leu Glu Val Ile Glu Ala His Tyr Leu Phe Gly 225 230 235 240 Leu Asp Tyr Asp Tyr Ile Asp Ile Val Ile His Pro Gln Ser Ile Ile 245 250 255 His Ser Leu Ile Glu Leu Glu Asp Thr Ser Val Leu Ala Gln Leu Gly 260 265 270 Trp Pro Asp Met Arg Leu Pro Leu Leu Tyr Ala Leu Ser Trp Pro Asp 275 280 285 Arg Leu Ser Thr Gln Trp Ser Ala Leu Asp Leu Val Lys Ala Gly Ser 290 295 300 Leu Glu Phe Arg Glu Pro Asp His Ala Lys Tyr Pro Cys Met Asp Leu 305 310 315 320 Ala Tyr Ala Ala Gly Arg Lys Gly Gly Thr Met Pro Ala Val Leu Asn 325 330 335 Ala Ala Asn Glu Gln Ala Val Ala Leu Phe Leu Glu Glu Gln Ile His 340 345 350 Phe Ser Asp Ile Pro Arg Leu Ile Glu Arg Ala Cys Asp Arg His 
Gln 355 360 365 Thr Glu Trp Gln Gln Gln Pro Ser Leu Asp Asp Ile Leu Ala Tyr Asp 370 375 380 Ala Trp Ala Arg Gln Phe Val Gln Ala Ser Tyr Gln Ser Leu Glu Ser 385 390 395 400 Val Val 124 386 PRT Streptomyces griseolosporeus 124 Met Val Ile Leu Gly Ser Thr Gly Ser Ile Gly Thr Gln Ala Ile Asp 1 5 10 15 Val Val Leu Arg Asn Pro Gly Arg Phe Lys Val Val Ala Leu Ser Ala 20 25 30 Ala Gly Gly Ala Val Glu Leu Leu Ala Glu Gln Ala Val Ala Leu Gly 35 40 45 Val His Thr Val Ala Val Ala Asp Pro Ala Ala Glu Glu Ala Ala Ala 50 55 60 Arg Gly Pro Gly Gly Gln Gly Ala Gly Arg Pro Leu Pro Arg Val Leu 65 70 75 80 Ala Gly Pro Asp Ala Ala Thr Glu Leu Ala Ala Ala Glu Cys His Ser 85 90 95 Val Leu Asn Gly Ile Thr Gly Ser Ile Gly Leu Ala Pro Thr Leu Ala 100 105 110 Ala Leu Arg Ala Gly Arg Val Leu Val Leu Ala Asn Lys Glu Ser Leu 115 120 125 Ile Val Gly Gly Pro Leu Val Lys Ala Val Ala Gln Pro Gly Gln Ile 130 135 140 Val Pro Val Asp Ser Glu His Ala Ala Leu Phe Gln Ala Leu Ala Gly 145 150 155 160 Gly Ala Arg Ala Glu Val Arg Lys Leu Val Val Thr Ala Ser Gly Gly 165 170 175 Pro Phe Arg Asn Arg Thr Arg Glu Gln Leu Ala Ala Val Thr Pro Ala 180 185 190 Asp Ala Leu Ala His Pro Thr Trp Ala Met Gly Pro Val Val Thr Ile 195 200 205 Asn Ser Ala Thr Leu Val Asn Lys Gly Leu Glu Val Ile Glu Ala His 210 215 220 Leu Leu Tyr Asp Val Pro Phe Asp Arg Ile Glu Val Val Val His Pro 225 230 235 240 Gln Ser Val Val His Ser Met Val Glu Phe Val Asp Gly Ser Thr Met 245 250 255 Ala Gln Ala Ser Pro Pro Asp Met Arg Met Pro Ile Ala Leu Gly Leu 260 265 270 Gly Trp Pro Asp Arg Val Pro Asp Ala Ala Pro Gly Cys Asp Trp Thr 275 280 285 Lys Ala Ala Thr Trp Glu Phe Phe Pro Leu Asp Asn Glu Ala Phe Pro 290 295 300 Ala Val Glu Leu Ala Arg Glu Val Gly Thr Leu Gly Gly Thr Ala Pro 305 310 315 320 Ala Val Phe Asn Ala Ala Asn Glu Glu Cys Val Asp Ala Phe Leu Lys 325 330 335 Gly Ala Leu Pro Phe Thr Gly Ile Val Asp Thr Val Ala Lys Val Val 340 345 350 Ala Glu His Gly Thr Pro Gln Ser Gly Thr Ser Leu Thr Val Glu Asp 355 360 365 Val Leu His Ala Glu Ser Trp 
Ala Arg Ala Arg Ala Arg Glu Leu Ala 370 375 380 Ala Gly 385 125 388 PRT Bacillus subtilis 125 Met Lys Asn Ile Cys Leu Leu Gly Ala Thr Gly Ser Ile Gly Glu Gln 1 5 10 15 Thr Leu Asp Val Leu Arg Ala His Gln Asp Gln Phe Gln Leu Val Ser 20 25 30 Met Ser Phe Gly Arg Asn Ile Asp Lys Ala Val Pro Met Ile Glu Val 35 40 45 Phe Gln Pro Lys Phe Val Ser Val Gly Asp Leu Asp Thr Tyr His Lys 50 55 60 Leu Lys Gln Met Ser Phe Ser Phe Glu Cys Gln Ile Gly Leu Gly Glu 65 70 75 80 Glu Gly Leu Ile Glu Ala Ala Val Met Glu Glu Val Asp Ile Val Val 85 90 95 Asn Ala Leu Leu Gly Ser Val Gly Leu Ile Pro Thr Leu Lys Ala Ile 100 105 110 Glu Gln Lys Lys Thr Ile Ala Leu Ala Asn Lys Glu Thr Leu Val Thr 115 120 125 Ala Gly His Ile Val Lys Glu His Ala Lys Lys Tyr Asp Val Pro Leu 130 135 140 Leu Pro Val Asp Ser Glu His Ser Ala Ile Phe Gln Ala Leu Gln Gly 145 150 155 160 Glu Gln Ala Lys Asn Ile Glu Arg Leu Ile Ile Thr Ala Ser Gly Gly 165 170 175 Ser Phe Arg Asp Lys Thr Arg Glu Glu Leu Glu Ser Val Thr Val Glu 180 185 190 Asp Ala Leu Lys His Pro Asn Trp Ser Met Gly Ala Lys Ile Thr Ile 195 200 205 Asp Ser Ala Thr Met Met Asn Lys Gly Leu Glu Val Ile Glu Ala His 210 215 220 Trp Leu Phe Asp Ile Pro Tyr Glu Gln Ile Asp Val Val Leu His Lys 225 230 235 240 Glu Ser Ile Ile His Ser Met Val Glu Phe His Asp Lys Ser Val Ile 245 250 255 Ala Gln Leu Gly Thr Pro Asp Met Arg Val Pro Ile Gln Tyr Ala Leu 260 265 270 Thr Tyr Pro Asp Arg Leu Pro Leu Pro Asp Ala Lys Arg Leu Glu Leu 275 280 285 Trp Glu Ile Gly Ser Leu His Phe Glu Lys Ala Asp Phe Asp Arg Phe 290 295 300 Arg Cys Leu Gln Phe Ala Phe Glu Ser Gly Lys Ile Gly Gly Thr Met 305 310 315 320 Pro Thr Val Leu Asn Ala Ala Asn Glu Val Ala Val Ala Ala Phe Leu 325 330 335 Ala Gly Lys Ile Pro Phe Leu Ala Ile Glu Asp Cys Ile Glu Lys Ala 340 345 350 Leu Thr Arg His Gln Leu Leu Lys Lys Pro Ser Trp Arg Thr Phe Lys 355 360 365 Lys Trp Thr Lys Ile Pro Gly Asp Thr Ser Ile Gln Tyr Ser His Lys 370 375 380 Val Val Cys Ser 385 126 406 PRT Mycobacterium leprae 126 Met Asn Asn Pro Ile Glu 
Gly His Ala Gly Gly Arg Leu Arg Val Leu 1 5 10 15 Val Leu Gly Ser Thr Gly Ser Ile Gly Thr Gln Ala Leu Glu Val Ile 20 25 30 Ala Ala Asn Pro Asp Arg Phe Glu Val Val Gly Leu Ala Ala Gly Gly 35 40 45 Ala Gln Leu Asp Thr Leu Leu Arg Gln Arg Ala Ala Thr Gly Val Thr 50 55 60 Asn Ile Ala Ile Ala Asp Asp Arg Ala Ala Gln Leu Ala Gly Asp Ile 65 70 75 80 Pro Tyr His Gly Thr Asp Ala Val Thr Arg Leu Val Glu Glu Thr Glu 85 90 95 Ala Asp Val Val Leu Asn Ala Leu Val Gly Ala Leu Gly Leu Arg Pro 100 105 110 Thr Leu Ala Ala Leu His Thr Gly Ala Arg Leu Ala Leu Ala Asn Lys 115 120 125 Glu Ser Leu Val Ala Gly Gly Ser Leu Val Leu Ala Ala Ala Gln Pro 130 135 140 Gly Gln Ile Val Pro Val Asp Ser Glu His Ser Ala Leu Ala Gln Cys 145 150 155 160 Leu Arg Gly Gly Thr Pro Asp Glu Val Ala Lys Leu Val Leu Thr Ala 165 170 175 Ser Gly Gly Pro Phe Arg Gly Trp Asn Ala Gly Asp Leu Glu Arg Val 180 185 190 Thr Pro Glu Gln Ala Gly Val His Pro Thr Trp Ser Met Gly Thr Met 195 200 205 Asn Thr Leu Asn Ser Ala Ser Leu Val Asn Lys Gly Leu Glu Leu Ile 210 215 220 Glu Ala Asn Leu Leu Phe Gly Ile Pro Tyr Asp Arg Ile Glu Val Val 225 230 235 240 Val His Pro Gln Ser Ile Val His Ser Met Val Thr Phe Ile Asp Gly 245 250 255 Ser Thr Ile Ala Gln Ala Ser Pro Pro Asp Met Lys Leu Pro Ile Ser 260 265 270 Leu Ala Leu Gly Trp Pro Gln Arg Val Gly Gly Ala Ala Arg Ala Cys 275 280 285 Ala Phe Thr Thr Ala Ser Thr Trp Glu Phe Glu Pro Leu Asp Ile Asp 290 295 300 Val Phe Pro Ala Val Glu Leu Ala Arg His Ala Gly Gln Ile Gly Gly 305 310 315 320 Cys Met Thr Ala Ile Tyr Asp Ala Ala Asn Glu Glu Ala Ala Glu Ala 325 330 335 Phe Leu Gln Gly Arg Ile Gly Phe Pro Ala Ile Val Ala Thr Ile Ala 340 345 350 Asp Val Leu Gln Arg Ala Asp Gln Trp Ala Pro Gln Trp Gly Glu Gly 355 360 365 Pro Ala Thr Val Asp Asp Val Leu Asp Ala Gln Arg Trp Ala Arg Glu 370 375 380 Arg Ala Leu Cys Ala Val Ala Thr Ala Ser Ser Gly Lys Val Ser Asp 385 390 395 400 Met Val Leu Glu Arg Ser 405 127 436 PRT Mycobacterium tuberculosis 127 Met Ala Thr Gly Gly Arg Val Val Ile Arg Arg Arg 
Gly Asp Asn Glu 1 5 10 15 Val Val Ala His Asn Asp Glu Val Thr Asn Ser Thr Asp Gly Arg Ala 20 25 30 Asp Gly Arg Leu Arg Val Val Val Leu Gly Ser Thr Gly Ser Ile Gly 35 40 45 Thr Gln Ala Leu Gln Val Ile Ala Asp Asn Pro Asp Arg Phe Glu Val 50 55 60 Val Gly Leu Ala Ala Gly Gly Ala His Leu Asp Thr Leu Leu Arg Gln 65 70 75 80 Arg Ala Gln Thr Gly Val Thr Asn Ile Ala Val Ala Asp Glu His Ala 85 90 95 Ala Gln Arg Val Gly Asp Ile Pro Tyr His Gly Ser Asp Ala Ala Thr 100 105 110 Arg Leu Val Glu Gln Thr Glu Ala Asp Val Val Leu Asn Ala Leu Val 115 120 125 Gly Ala Leu Gly Leu Arg Pro Thr Leu Ala Ala Leu Lys Thr Gly Ala 130 135 140 Arg Leu Ala Leu Ala Asn Lys Glu Ser Leu Val Ala Gly Gly Ser Leu 145 150 155 160 Val Leu Arg Ala Ala Arg Pro Gly Gln Ile Val Pro Val Asp Ser Glu 165 170 175 His Ser Ala Leu Ala Gln Cys Leu Arg Gly Gly Thr Pro Asp Glu Val 180 185 190 Ala Lys Leu Val Leu Thr Ala Ser Gly Gly Pro Phe Arg Gly Trp Ser 195 200 205 Ala Ala Asp Leu Glu His Val Thr Pro Glu Gln Ala Gly Ala His Pro 210 215 220 Thr Trp Ser Met Gly Pro Met Asn Thr Leu Asn Ser Ala Ser Leu Val 225 230 235 240 Asn Lys Gly Leu Glu Val Ile Glu Thr His Leu Leu Phe Gly Ile Pro 245 250 255 Tyr Asp Arg Ile Asp Val Val Val His Pro Gln Ser Ile Ile His Ser 260 265 270 Met Val Thr Phe Ile Asp Gly Ser Thr Ile Ala Gln Ala Ser Pro Pro 275 280 285 Asp Met Lys Leu Pro Ile Ser Leu Ala Leu Gly Trp Pro Arg Arg Val 290 295 300 Ser Gly Ala Ala Ala Ala Cys Asp Phe His Thr Ala Ser Ser Trp Glu 305 310 315 320 Phe Glu Pro Leu Asp Thr Asp Val Phe Pro Ala Val Glu Leu Ala Arg 325 330 335 Gln Ala Gly Val Ala Gly Gly Cys Met Thr Ala Val Tyr Asn Ala Ala 340 345 350 Asn Glu Glu Ala Ala Ala Ala Phe Leu Ala Gly Arg Ile Gly Phe Pro 355 360 365 Ala Ile Val Gly Ile Ile Ala Asp Val Leu His Ala Ala Asp Gln Trp 370 375 380 Ala Val Glu Pro Ala Thr Val Asp Asp Val Leu Asp Ala Gln Arg Trp 385 390 395 400 Ala Arg Glu Arg Ala Gln Arg Ala Val Ser Gly Met Ala Ser Val Ala 405 410 415 Ile Ala Ser Thr Ala Lys Pro Gly Ala Ala Gly Arg His Ala Ser Thr 420 
425 430 Leu Glu Arg Ser 435 128 477 PRT Arabidopsis thaliana 128 Met Met Thr Leu Asn Ser Leu Ser Pro Ala Glu Ser Lys Ala Ile Ser 1 5 10 15 Phe Leu Asp Thr Ser Arg Phe Asn Pro Ile Pro Lys Leu Ser Gly Gly 20 25 30 Phe Ser Leu Arg Arg Arg Asn Gln Gly Arg Gly Phe Gly Lys Gly Val 35 40 45 Lys Cys Ser Val Lys Val Gln Gln Gln Gln Gln Pro Pro Pro Ala Trp 50 55 60 Pro Gly Arg Ala Val Pro Glu Ala Pro Arg Gln Ser Trp Asp Gly Pro 65 70 75 80 Lys Pro Ile Ser Ile Val Gly Ser Thr Gly Ser Ile Gly Thr Gln Thr 85 90 95 Leu Asp Ile Val Ala Glu Asn Pro Asp Lys Phe Arg Val Val Ala Leu 100 105 110 Ala Ala Gly Ser Asn Val Thr Leu Leu Ala Asp Gln Val Arg Arg Phe 115 120 125 Lys Pro Ala Leu Val Ala Val Arg Asn Glu Ser Leu Ile Asn Glu Leu 130 135 140 Lys Glu Ala Leu Ala Asp Leu Asp Tyr Lys Leu Glu Ile Ile Pro Gly 145 150 155 160 Glu Gln Gly Val Ile Glu Val Ala Arg His Pro Glu Ala Val Thr Val 165 170 175 Val Thr Gly Ile Val Gly Cys Ala Gly Leu Lys Pro Thr Val Ala Ala 180 185 190 Ile Glu Ala Gly Lys Asp Ile Ala Leu Ala Asn Lys Glu Thr Leu Ile 195 200 205 Ala Gly Gly Pro Phe Val Leu Pro Leu Ala Asn Lys His Asn Val Lys 210 215 220 Ile Leu Pro Ala Asp Ser Glu His Ser Ala Ile Phe Gln Cys Ile Gln 225 230 235 240 Gly Leu Pro Glu Gly Ala Leu Arg Lys Ile Ile Leu Thr Ala Ser Gly 245 250 255 Gly Ala Phe Arg Asp Trp Pro Val Glu Lys Leu Lys Glu Val Lys Val 260 265 270 Ala Asp Ala Leu Lys His Pro Asn Trp Asn Met Gly Lys Lys Ile Thr 275 280 285 Val Asp Ser Ala Thr Leu Phe Asn Lys Gly Leu Glu Val Ile Glu Ala 290 295 300 His Tyr Leu Phe Gly Ala Glu Tyr Asp Asp Ile Glu Ile Val Ile His 305 310 315 320 Pro Gln Ser Ile Ile His Ser Met Ile Glu Thr Gln Asp Ser Ser Val 325 330 335 Leu Ala Gln Leu Gly Trp Pro Asp Met Arg Leu Pro Ile Leu Tyr Thr 340 345 350 Met Ser Trp Pro Asp Arg Val Pro Cys Ser Glu Val Thr Trp Pro Arg 355 360 365 Leu Asp Leu Cys Lys Leu Gly Ser Leu Thr Phe Lys Lys Pro Asp Asn 370 375 380 Val Lys Tyr Pro Ser Met Asp Leu Ala Tyr Ala Ala Gly Arg Ala Gly 385 390 395 400 Gly Thr Met Thr Gly Val Leu Ser 
Ala Ala Asn Glu Lys Ala Val Glu 405 410 415 Met Phe Ile Asp Glu Lys Ile Ser Tyr Leu Asp Ile Phe Lys Val Val 420 425 430 Glu Leu Thr Cys Asp Lys His Arg Asn Glu Leu Val Thr Ser Pro Ser 435 440 445 Leu Glu Glu Ile Val His Tyr Asp Leu Trp Ala Arg Glu Tyr Ala Ala 450 455 460 Asn Val Gln Leu Ser Ser Gly Ala Arg Pro Val His Ala 465 470 475 129 19 DNA Artificial Sequence Primer 129 cgtaacgctc ggtctcgtc 19 130 356 PRT Campylobacter jejuni 130 Met Ile Leu Phe Gly Ser Thr Gly Ser Ile Gly Val Asn Ala Leu Lys 1 5 10 15 Leu Ala Ala Leu Lys Asn Ile Pro Ile Ser Ala Leu Ala Cys Gly Asp 20 25 30 Asn Ile Ala Leu Leu Asn Glu Gln Ile Ala Arg Phe Lys Pro Lys Phe 35 40 45 Val Ser Ile Lys Asp Ser Lys Asn Lys His Leu Val Lys His Asp Arg 50 55 60 Val Phe Ile Gly Gln Glu Gly Leu Glu Gln Ile Leu Thr Glu Cys Gln 65 70 75 80 Asp Lys Leu Leu Leu Asn Ala Ile Val Gly Phe Ala Gly Leu Lys Ser 85 90 95 Thr Leu Lys Ala Lys Glu Leu Gly Lys Asn Ile Ala Leu Ala Asn Lys 100 105 110 Glu Ser Leu Val Val Ala Gly Ser Phe Leu Lys Gly Ala Lys Phe Leu 115 120 125 Pro Val Asp Ser Glu His Ala Ala Leu Lys Phe Leu Leu Glu Gly Lys 130 135 140 Lys Asn Ile Ala Lys Leu Tyr Ile Thr Ala Ser Gly Gly Ala Phe Tyr 145 150 155 160 Arg Tyr Lys Ile Lys Asp Leu Asn Gln Val Ser Val Lys Asp Ala Leu 165 170 175 Lys His Pro Asn Trp Asn Met Gly Ala Lys Ile Thr Ile Asp Ser Ala 180 185 190 Thr Met Ala Asn Lys Leu Phe Glu Ile Ile Glu Ala Tyr His Leu Tyr 195 200 205 Asp Phe Lys Glu Ile Asp Ala Leu Ile Glu Pro Arg Ser Leu Val His 210 215 220 Ala Met Cys Glu Phe Lys Asn Gly Ala Ser Thr Ala Tyr Phe Ser Lys 225 230 235 240 Ala Asp Met Lys Leu Ala Ile Ser Asp Ala Ile Phe Glu Lys Gln Asp 245 250 255 Thr Pro Ile Leu Glu Ala Val Asp Phe Ser Lys Met Pro Ala Leu Lys 260 265 270 Phe His Pro Ile Ser Thr Lys Lys Tyr Pro Ile Phe Lys Leu Lys Asn 275 280 285 Thr Phe Leu Lys Glu Pro Asn Leu Gly Val Ile Ile Asn Ala Ala Asn 290 295 300 Glu Val Gly Val Tyr Asn Phe Leu Glu Asn Lys Ser Gly Phe Leu Asp 305 310 315 320 Ile Ala Lys Cys Ile Phe Lys Ala Leu Asp 
His Phe Gly Val Pro Lys 325 330 335 Ile Ser Ser Ile Glu Glu Val Phe Glu Tyr Asp Phe Lys Thr Arg Glu 340 345 350 Tyr Leu Arg Ser 355 131 486 PRT Plasmodium falciparum 131 Met Lys Lys Tyr Ile Tyr Ile Tyr Phe Phe Phe Ile Thr Ile Thr Ile 1 5 10 15 Asn Asp Leu Val Ile Asn Asn Thr Ser Lys Cys Val Ser Ile Glu Arg 20 25 30 Arg Lys Asn Asn Ala Tyr Ile Asn Tyr Gly Ile Gly Tyr Asn Gly Pro 35 40 45 Asp Asn Lys Ile Thr Lys Ser Arg Arg Cys Lys Arg Ile Lys Leu Cys 50 55 60 Lys Lys Asp Leu Ile Asp Ile Gly Ala Ile Lys Lys Pro Ile Asn Val 65 70 75 80 Ala Ile Phe Gly Ser Thr Gly Ser Ile Gly Thr Asn Ala Leu Asn Ile 85 90 95 Ile Arg Glu Cys Asn Lys Ile Glu Asn Val Phe Asn Val Lys Ala Leu 100 105 110 Tyr Val Asn Lys Ser Val Asn Glu Leu Tyr Glu Gln Ala Arg Glu Phe 115 120 125 Leu Pro Glu Tyr Leu Cys Ile His Asp Lys Ser Val Tyr Glu Glu Leu 130 135 140 Lys Glu Leu Val Lys Asn Ile Lys Asp Tyr Lys Pro Ile Ile Leu Cys 145 150 155 160 Gly Asp Glu Gly Met Lys Glu Ile Cys Ser Ser Asn Ser Ile Asp Lys 165 170 175 Ile Val Ile Gly Ile Asp Ser Phe Gln Gly Leu Tyr Ser Thr Met Tyr 180 185 190 Ala Ile Met Asn Asn Lys Ile Val Ala Leu Ala Asn Lys Glu Ser Ile 195 200 205 Val Ser Ala Gly Phe Phe Leu Lys Lys Leu Leu Asn Ile His Lys Asn 210 215 220 Ala Lys Ile Ile Pro Val Asp Ser Glu His Ser Ala Ile Phe Gln Cys 225 230 235 240 Leu Asp Asn Asn Lys Val Leu Lys Thr Lys Cys Leu Gln Asp Asn Phe 245 250 255 Ser Lys Ile Asn Asn Ile Asn Lys Ile Phe Leu Cys Ser Ser Gly Gly 260 265 270 Pro Phe Gln Asn Leu Thr Met Asp Glu Leu Lys Asn Val Thr Ser Glu 275 280 285 Asn Ala Leu Lys His Pro Lys Trp Lys Met Gly Lys Lys Ile Thr Ile 290 295 300 Asp Ser Ala Thr Met Met Asn Lys Gly Leu Glu Val Ile Glu Thr His 305 310 315 320 Phe Leu Phe Asp Val Asp Tyr Asn Asp Ile Glu Val Ile Val His Lys 325 330 335 Glu Cys Ile Ile His Ser Cys Val Glu Phe Ile Asp Lys Ser Val Ile 340 345 350 Ser Gln Met Tyr Tyr Pro Asp Met Gln Ile Pro Ile Leu Tyr Ser Leu 355 360 365 Thr Trp Pro Asp Arg Ile Lys Thr Asn Leu Lys Pro Leu Asp Leu Ala 370 375 380 Gln Val 
Ser Thr Leu Thr Phe His Lys Pro Ser Leu Glu His Phe Pro 385 390 395 400 Cys Ile Lys Leu Ala Tyr Gln Ala Gly Ile Lys Gly Asn Phe Tyr Pro 405 410 415 Thr Val Leu Asn Ala Ser Asn Glu Ile Ala Asn Asn Leu Phe Leu Asn 420 425 430 Asn Lys Ile Lys Tyr Phe Asp Ile Ser Ser Ile Ile Ser Gln Val Leu 435 440 445 Glu Ser Phe Asn Ser Gln Lys Val Ser Glu Asn Ser Glu Asp Leu Met 450 455 460 Lys Gln Ile Leu Gln Ile His Ser Trp Ala Lys Asp Lys Ala Thr Asp 465 470 475 480 Ile Tyr Asn Lys His Asn 485 132 24 DNA Artificial Sequence Primer 132 ccsgtsgayw ssgarcayaa cgcs 24 133 21 DNA Artificial Sequence Primer 133 atgatgaaca agggsctsga r 21 134 20 DNA Artificial Sequence Primer 134 catccvaact ggwmvatggg 20 135 20 DNA Artificial Sequence Primer 135 atyggyrwwc kcatatcmgg 20 136 27 DNA Artificial Sequence Primer 136 cgaatggacg acggattggc gatggac 27 137 29 DNA Artificial Sequence Primer 137 tcagttcgag ccccttgttc atcatcgtc 29 138 28 DNA Artificial Sequence Primer 138 cgaactgatc gaagccttcc acctgttc 28 139 27 DNA Artificial Sequence Primer 139 ggtccatcgc caatccgtcg tccattc 27 140 30 DNA Artificial Sequence Primer 140 attatctaga atccgccccg cctccacctc 30 141 30 DNA Artificial Sequence Primer 141 gatggatcct gggtagggtc gctgctgtcc 30 142 34 DNA Artificial Sequence Primer 142 ttatctagaa ccgtctacgc cgacctcgtt caac 34 143 30 DNA Artificial Sequence Primer 143 ttaggatccc ctccgctggt ccgattgaac 30 144 20 DNA Artificial Sequence Primer 144 aggcgattaa gttgggtaac 20 145 18 DNA Artificial Sequence Primer 145 gaccatgatt acgccaag 18 146 30 DNA Artificial Sequence Primer 146 gataatcgat gtgtgactga cctgtccaac 30 147 30 DNA Artificial Sequence Primer 147 cttaggtacc atgttggaga ttcaaggtgg 30 148 30 DNA Artificial Sequence Primer 148 ctagatcgat acttgcggtc ggactgatag 30 149 30 DNA Artificial Sequence Primer 149 gtcgctcgag atcagataat cgtcgctcaa 30 150 30 DNA Artificial Sequence Primer 150 atatggtacc gacatggacg aggaagacgc 30 151 41 DNA Artificial Sequence Primer 151 tagagaattc gaaggaagag catgggattg gacgaggttt c 41 152 41 
DNA Artificial Sequence Primer 152 tactacttgt atgtaggtac cacttgcggt cggactgata g 41 153 41 DNA Artificial Sequence Primer 153 gcgtgatatc gaaggaagag catgagcgca accgtccacc g 41 154 41 DNA Artificial Sequence Primer 154 actgctaggg tccgaggtac cgacatggac gaggaagacg c 41 155 42 DNA Artificial Sequence Primer 155 gatgatatcg aaggaagagc atggtgaagc gcgtcacggt gt 42 156 40 DNA Artificial Sequence Primer 156 caagagtcag aaggtacccg ccagaatggt gagcaggatg 40 157 40 DNA Artificial Sequence Primer 157 caagagtcag aatctagacg ccagaatggt gagcaggatg 40 158 39 DNA Artificial Sequence Primer 158 ctagatatcg gaaggaagag catgtcacac cccgcgtta 39 159 36 DNA Artificial Sequence Primer 159 tcaggtaccg tgtcgccacc cacaacgccc ataatg 36 160 34 DNA Artificial Sequence Primer 160 tagggtacca ccgtctacgc cgacctcgtt caac 34 161 34 DNA Artificial Sequence Primer 161 tgtatgcatg tcgccaccca caacgcccat aatg 34 162 43 DNA Artificial Sequence Primer 162 gacgaagctt gaaggaagag catgcctccc ctcaccctct atc 43 163 43 DNA Artificial Sequence Primer 163 gtcactgaat gaatggtacc gcagccgaga accgccagaa gcc 43 164 34 DNA Artificial Sequence Primer 164 tagggtacca ccgtctacgc cgacctcgtt gaac 34 165 34 DNA Artificial Sequence Primer 165 aggcaatgca tgcagccgag aaccgccaga agcc 34 166 32 DNA Artificial Sequence Primer 166 aagcatgcga aaaagttgac acctgtggag tc 32 167 29 DNA Artificial Sequence Primer 167 actctagaag cacctgcgaa tggacgaag 29 168 31 DNA Artificial Sequence Primer 168 ataaaggcct tacatggcga tagctagact g 31 169 31 DNA Artificial Sequence Primer 169 aaggctcgag aaggatctta ccgctgttga g 31 170 32 DNA Artificial Sequence Primer 170 atagtactga aaaagttgac acctgtggag tc 32 171 29 DNA Artificial Sequence Primer 171 atagtactag cacctgcgaa tggacgaag 29 172 40 DNA Artificial Sequence Primer 172 cgtggcatgc gtgtaagaaa aagttgacac ctgtggagtc 40 173 40 DNA Artificial Sequence Primer 173 ctaagagctc agttcgggct cggtctcgcc tttcaggaag 40 174 40 DNA Artificial Sequence Primer 174 gagagcgaga gccagatcaa gaagsggctg aaggacatcc 40 175 40 DNA Artificial 
Sequence Primer 175 ggatgtcctt cagccscttc ttgatctggc tctcgctctc 40 176 30 DNA Artificial Sequence Primer 176 agtcagtact aactggtgaa gacgctgaag 30 177 30 DNA Artificial Sequence Primer 177 gatcagtact gtgaacgaat acgatacgca 30 178 40 DNA Artificial Sequence Primer 178 gtcaaatgag ctccaaactg gtgaagacgc tgaaggacat 40 179 40 DNA Artificial Sequence Primer 179 cagtcgggca tgcgtccatt tcagttgaca tacttctgtg 40 180 41 DNA Artificial Sequence Primer 180 ctcttgctcg gcggcgtgcg gctctatcac gagggggtgg a 41 181 41 DNA Artificial Sequence Primer 181 tccaccccct cgtgatagag ccgcacgccg ccgagcaaga g 41 182 20 DNA Artificial Sequence Primer 182 gagcagcaca ctctgggagc 20 183 21 DNA Artificial Sequence Primer 183 ccacacaggt aggacaccca c 21 184 40 DNA Artificial Sequence Primer 184 tcagagctcg tgtgatcgaa tggggctttg ttccttgatg 40 185 39 DNA Artificial Sequence Primer 185 gaagcatgca ggtgatcgac gtgccactcg tccgaatag 39 186 20 DNA Artificial Sequence Primer 186 ctcacaacct ccaaccgatg 20 187 20 DNA Artificial Sequence Exemplary motif 187 aggtcgtgta ctgtcagtca 20 188 20 DNA Artificial Sequence Exemplary motif 188 acgtggtgaa ctgccagtga 20 189 40 DNA Artificial Sequence Primer 189 actatcgatg aaggaagagc atggctgacc tacccaagac 40 190 30 DNA Artificial Sequence Primer 190 actagaattc cgcaacagtt ccttcatgtc 30 

What is claimed is:
 1. An isolated nucleic acid comprising a nucleic acid sequence having a length and a percent identity to the sequence set forth in SEQ ID NO: 1 over said length, wherein the point defined by said length and said percent identity is within the area defined by points A, B, C, and D of FIG. 26, wherein point A has coordinates (3626, 100), point B has coordinates (3626, 65), point C has coordinates (50, 65), and point D has coordinates (12, 100).
 2. The isolated nucleic acid of claim 1, wherein said point B has coordinates (3626, 85).
 3. The isolated nucleic acid of claim 1, wherein said point C has coordinates (100, 65).
 4. The isolated nucleic acid of claim 1, wherein said point C has coordinates (50, 85).
 5. The isolated nucleic acid of claim 1, wherein said point D has coordinates (15, 100).
 6. The isolated nucleic acid of claim 1, wherein said nucleic acid sequence encodes a polypeptide.
 7. The isolated nucleic acid of claim 6, wherein said polypeptide has DXS activity.
 8. The isolated nucleic acid of claim 1, wherein said nucleic acid sequence is as set forth in SEQ ID NO:1.
 9. An isolated nucleic acid comprising a nucleic acid sequence having a length and a percent identity to the sequence set forth in SEQ ID NO:2 over said length, wherein the point defined by said length and said percent identity is within the area defined by points A, B, C, and D of FIG. 26, wherein point A has coordinates (1926, 100), point B has coordinates (1926, 65), point C has coordinates (50, 65), and point D has coordinates (12, 100).
 10. The isolated nucleic acid of claim 9, wherein said nucleic acid sequence encodes a polypeptide.
 11. The isolated nucleic acid of claim 10, wherein said polypeptide has DXS activity.
 12. An isolated nucleic acid comprising a nucleic acid sequence, wherein said nucleic acid sequence encodes a polypeptide comprising an amino acid sequence, wherein said amino acid sequence has a length and a percent identity to the sequence set forth in SEQ ID NO:3 over said length, wherein the point defined by said length and said percent identity is within the area defined by points A, B, C, and D of FIG. 26, wherein point A has coordinates (641, 100), point B has coordinates (641, 65), point C has coordinates (25, 65), and point D has coordinates (5, 100).
 13. The isolated nucleic acid of claim 12, wherein said polypeptide has DXS activity.
 14. An isolated nucleic acid comprising a nucleic acid sequence having a length and a percent identity to the sequence set forth in SEQ ID NO:37 over said length, wherein the point defined by said length and said percent identity is within the area defined by points A, B, C, and D of FIG. 26, wherein point A has coordinates (1990, 100), point B has coordinates (1990, 65), point C has coordinates (50, 65), and point D has coordinates (16, 100).
 15. The isolated nucleic acid of claim 14, wherein said point B has coordinates (1990, 85).
 16. The isolated nucleic acid of claim 14, wherein said point C has coordinates (100, 65).
 17. The isolated nucleic acid of claim 14, wherein said point C has coordinates (50, 85).
 18. The isolated nucleic acid of claim 14, wherein said point D has coordinates (20, 100).
 19. The isolated nucleic acid of claim 14, wherein said nucleic acid sequence encodes a polypeptide.
 20. The isolated nucleic acid of claim 19, wherein said polypeptide has DDS activity.
 21. The isolated nucleic acid of claim 14, wherein said nucleic acid sequence is as set forth in SEQ ID NO:37.
 22. An isolated nucleic acid comprising a nucleic acid sequence having a length and a percent identity to the sequence set forth in SEQ ID NO:38 over said length, wherein the point defined by said length and said percent identity is within the area defined by points A, B, C, and D of FIG. 26, wherein point A has coordinates (1002, 100), point B has coordinates (1002, 65), point C has coordinates (50, 65), and point D has coordinates (16, 100).
 23. The isolated nucleic acid of claim 22, wherein said nucleic acid sequence encodes a polypeptide.
 24. The isolated nucleic acid of claim 23, wherein said polypeptide has DDS activity.
 25. An isolated nucleic acid comprising a nucleic acid sequence, wherein said nucleic acid sequence encodes a polypeptide comprising an amino acid sequence, wherein said amino acid sequence has a length and a percent identity to the sequence set forth in SEQ ID NO:39 over said length, wherein the point defined by said length and said percent identity is within the area defined by points A, B, C, and D of FIG. 26, wherein point A has coordinates (333, 100), point B has coordinates (333, 65), point C has coordinates (25, 65), and point D has coordinates (5, 100).
 26. The isolated nucleic acid of claim 25, wherein said polypeptide has DDS activity.
 27. An isolated nucleic acid comprising a nucleic acid sequence having a length and a percent identity to the sequence set forth in SEQ ID NO:40 over said length, wherein the point defined by said length and said percent identity is within the area defined by points A, B, C, and D of FIG. 26, wherein point A has coordinates (1833, 100), point B has coordinates (1833, 65), point C has coordinates (50, 65), and point D has coordinates (16, 100).
 28. The isolated nucleic acid of claim 27, wherein said point B has coordinates (1833, 85).
 29. The isolated nucleic acid of claim 27, wherein said point C has coordinates (100, 65).
 30. The isolated nucleic acid of claim 27, wherein said point C has coordinates (50, 85).
 31. The isolated nucleic acid of claim 27, wherein said point D has coordinates (20, 100).
 32. The isolated nucleic acid of claim 27, wherein said nucleic acid sequence encodes a polypeptide.
 33. The isolated nucleic acid of claim 32, wherein said polypeptide has DDS activity.
 34. The isolated nucleic acid of claim 27, wherein said nucleic acid sequence is as set forth in SEQ ID NO:40.
 35. An isolated nucleic acid comprising a nucleic acid sequence having a length and a percent identity to the sequence set forth in SEQ ID NO:41 over said length, wherein the point defined by said length and said percent identity is within the area defined by points A, B, C, and D of FIG. 26, wherein point A has coordinates (1014, 100), point B has coordinates (1014, 65), point C has coordinates (50, 65), and point D has coordinates (16, 100).
 36. The isolated nucleic acid of claim 35, wherein said nucleic acid sequence encodes a polypeptide.
 37. The isolated nucleic acid of claim 36, wherein said polypeptide has DDS activity.
 38. An isolated nucleic acid comprising a nucleic acid sequence, wherein said nucleic acid sequence encodes a polypeptide comprising an amino acid sequence, wherein said amino acid sequence has a length and a percent identity to the sequence set forth in SEQ ID NO:42 over said length, wherein the point defined by said length and said percent identity is within the area defined by points A, B, C, and D of FIG. 26, wherein point A has coordinates (337, 100), point B has coordinates (337, 65), point C has coordinates (25, 65), and point D has coordinates (5, 100).
 39. The isolated nucleic acid of claim 38, wherein said polypeptide has DDS activity.
 40. An isolated nucleic acid comprising a nucleic acid sequence having a length and a percent identity to the sequence set forth in SEQ ID NO:95 over said length, wherein the point defined by said length and said percent identity is within the area defined by points A, B, C, and D of FIG. 26, wherein point A has coordinates (2017, 100), point B has coordinates (2017, 65), point C has coordinates (50, 65), and point D has coordinates (16, 100).
 41. The isolated nucleic acid of claim 40, wherein said point B has coordinates (2017, 85).
 42. The isolated nucleic acid of claim 40, wherein said point C has coordinates (100, 65).
 43. The isolated nucleic acid of claim 40, wherein said point C has coordinates (50, 85).
 44. The isolated nucleic acid of claim 40, wherein said point D has coordinates (20, 100).
 45. The isolated nucleic acid of claim 40, wherein said nucleic acid sequence encodes a polypeptide.
 46. The isolated nucleic acid of claim 45, wherein said polypeptide has DXR activity.
 47. The isolated nucleic acid of claim 40, wherein said nucleic acid sequence is as set forth in SEQ ID NO:95.
 48. An isolated nucleic acid comprising a nucleic acid sequence having a length and a percent identity to the sequence set forth in SEQ ID NO:96 over said length, wherein the point defined by said length and said percent identity is within the area defined by points A, B, C, and D of FIG. 26, wherein point A has coordinates (1161, 100), point B has coordinates (1161, 65), point C has coordinates (50, 65), and point D has coordinates (16, 100).
 49. The isolated nucleic acid of claim 48, wherein said nucleic acid sequence encodes a polypeptide.
 50. The isolated nucleic acid of claim 49, wherein said polypeptide has DXR activity.
 51. An isolated nucleic acid comprising a nucleic acid sequence, wherein said nucleic acid sequence encodes a polypeptide comprising an amino acid sequence, wherein said amino acid sequence has a length and a percent identity to the sequence set forth in SEQ ID NO:97 over said length, wherein the point defined by said length and said percent identity is within the area defined by points A, B, C, and D of FIG. 26, wherein point A has coordinates (386, 100), point B has coordinates (386, 65), point C has coordinates (25, 65), and point D has coordinates (5, 100).
 52. The isolated nucleic acid of claim 51, wherein said polypeptide has DXR activity.
 53. An isolated nucleic acid comprising a nucleic acid sequence of at least 12 nucleotides, wherein said isolated nucleic acid hybridizes under hybridization conditions to the sense or antisense strand of a nucleic acid molecule, the sequence of said nucleic acid molecule being the sequence set forth in SEQ ID NO:1, 2, 37, 38, 40, 41, 95, or 96.
 54. The isolated nucleic acid of claim 53, wherein said nucleic acid sequence is at least 50 nucleotides.
 55. The isolated nucleic acid of claim 53, wherein said nucleic acid sequence encodes a polypeptide.
 56. The isolated nucleic acid of claim 55, wherein said polypeptide has DXS, DDS, or DXR activity.
 57. A substantially pure polypeptide comprising an amino acid sequence, wherein said amino acid sequence has a length and a percent identity to the sequence set forth in SEQ ID NO:3 over said length, wherein the point defined by said length and said percent identity is within the area defined by points A, B, C, and D of FIG. 26, wherein point A has coordinates (641, 100), point B has coordinates (641, 65), point C has coordinates (25, 65), and point D has coordinates (5, 100).
 58. The substantially pure polypeptide of claim 57, wherein said polypeptide has DXS activity.
 59. A substantially pure polypeptide comprising an amino acid sequence, wherein said amino acid sequence has a length and a percent identity to the sequence set forth in SEQ ID NO:39 over said length, wherein the point defined by said length and said percent identity is within the area defined by points A, B, C, and D of FIG. 26, wherein point A has coordinates (333, 100), point B has coordinates (333, 65), point C has coordinates (25, 65), and point D has coordinates (5, 100).
 60. The substantially pure polypeptide of claim 59, wherein said polypeptide has DDS activity.
 61. A substantially pure polypeptide comprising an amino acid sequence, wherein said amino acid sequence has a length and a percent identity to the sequence set forth in SEQ ID NO:42 over said length, wherein the point defined by said length and said percent identity is within the area defined by points A, B, C, and D of FIG. 26, wherein point A has coordinates (337, 100), point B has coordinates (337, 65), point C has coordinates (25, 65), and point D has coordinates (5, 100).
 62. The substantially pure polypeptide of claim 61, wherein said polypeptide has DDS activity.
 63. A substantially pure polypeptide comprising an amino acid sequence, wherein said amino acid sequence has a length and a percent identity to the sequence set forth in SEQ ID NO:97 over said length, wherein the point defined by said length and said percent identity is within the area defined by points A, B, C, and D of FIG. 26, wherein point A has coordinates (386, 100), point B has coordinates (386, 65), point C has coordinates (25, 65), and point D has coordinates (5, 100).
 64. The substantially pure polypeptide of claim 63, wherein said polypeptide has DXR activity.
 65. A host cell comprising an isolated nucleic acid of claim 1, 9, 12, 14, 22, 25, 27, 35, 38, 40, 48, 51, or 53.
 66. The host cell of claim 65, wherein said host cell is prokaryotic.
 67. The host cell of claim 65, wherein said host cell is selected from the group consisting of Rhodobacter, Sphingomonas, and Escherichia cells.
 68. The host cell of claim 65, wherein said host cell comprises an exogenous nucleic acid that encodes a polypeptide having DDS, DXS, ODS, SDS, DXR, 4-diphosphocytidyl-2C-methyl-D-erythritol synthase, 4-diphosphocytidyl-2C-methyl-D-erythritol kinase, or chorismate lyase activity.
 69. The host cell of claim 65, wherein said host cell comprises an exogenous nucleic acid comprising a UbiC sequence or a LytB sequence.
 70. The host cell of claim 65, wherein said host cell comprises an exogenous nucleic acid comprising a UbiC sequence and a LytB sequence.
 71. The host cell of claim 65, wherein said host cell comprises a non-functional crtE sequence, ppsR sequence, or ccoN sequence.
 72. The host cell of claim 65, wherein said host cell comprises a non-functional crtE sequence, ppsR sequence, and ccoN sequence.
 73. A host cell comprising an exogenous nucleic acid and a non-functional crtE sequence, ppsR sequence, or ccoN sequence, wherein said exogenous nucleic acid is within a crtE, ppsR, or ccoN locus of said host cell.
 74. A host cell comprising a genomic deletion, wherein said deletion comprises at least a portion of a crtE sequence, ppsR sequence, or ccoN sequence, and wherein said host cell comprises a non-functional crtE sequence, ppsR sequence, or ccoN sequence.
 75. A method for increasing production of CoQ(10) in a cell having endogenous DDS activity, said method comprising inserting a nucleic acid molecule comprising a nucleic acid sequence that encodes a polypeptide having DDS activity into said cell such that production of CoQ(10) is increased.
 76. The method of claim 75, wherein said nucleic acid molecule comprises an isolated nucleic acid of claim 14, 22, 25, 27, 35, 38, or 53.
 77. The method of claim 75, wherein the production of CoQ(10) is increased at least about 5 percent as compared to a control cell lacking said inserted nucleic acid molecule.
 78. The method of claim 75, wherein said cell is selected from the group consisting of Rhodobacter and Sphingomonas cells.
 79. The method of claim 75, wherein said cell is a membraneous bacterium.
 80. The method of claim 75, wherein said cell is a highly membraneous bacterium.
 81. The method of claim 75, wherein said method further comprises inserting a second nucleic acid molecule comprising a nucleotide sequence that encodes a polypeptide having DXS activity into said cell.
 82. The method of claim 81, wherein said second nucleic acid molecule comprises an isolated nucleic acid of claim 1, 9, or 12.
 83. A method for increasing production of CoQ(10) in a cell having endogenous DDS activity, said method comprising inserting a nucleic acid molecule comprising a nucleic acid sequence that encodes a polypeptide having DXS activity into said cell such that production of CoQ(10) is increased.
 84. The method of claim 83, wherein the production of CoQ(10) is increased at least about 5 percent as compared to a control cell lacking said inserted nucleic acid molecule.
 85. The method of claim 83, wherein said cell is selected from the group consisting of Rhodobacter and Sphingomonas cells.
 86. The method of claim 83, wherein said nucleic acid molecule comprises an isolated nucleic acid of claim 1, 9, or 12.
 87. The method of claim 83, wherein said cell is a membraneous bacterium.
 88. The method of claim 83, wherein said cell is a highly membraneous bacterium.
 89. The method of claim 83, wherein said method further comprises inserting a second nucleic acid molecule comprising a nucleotide sequence that encodes a polypeptide having DDS activity into said cell.
 90. The method of claim 89, wherein said second nucleic acid molecule comprises an isolated nucleic acid of claim 14, 22, 25, 27, 35, 38, or 53.
 91. A method for increasing production of CoQ(10) in a membraneous bacterium, said method comprising inserting a nucleic acid molecule comprising a nucleic acid sequence that encodes a polypeptide having DDS activity into said bacterium such that production of CoQ(10) is increased.
 92. A method for increasing production of CoQ(10) in a highly membraneous bacterium, said method comprising inserting a nucleic acid molecule comprising a nucleic acid sequence that encodes a polypeptide having DDS activity into said highly membraneous bacterium such that production of CoQ(10) is increased.
 93. A method for making an isoprenoid, said method comprising culturing a cell under conditions wherein said cell produces said isoprenoid, said cell comprising at least one exogenous nucleic acid that encodes at least one polypeptide, wherein said cell produces more of said isoprenoid than a comparable cell lacking said at least one exogenous nucleic acid.
 94. The method of claim 93, wherein said cell is selected from the group consisting of Rhodobacter and Sphingomonas cells.
 95. The method of claim 93, wherein said isoprenoid is CoQ(10).
 96. The method of claim 93, wherein said at least one polypeptide has DDS, DXS, ODS, SDS, DXR, 4-diphosphocytidyl-2C-methyl-D-erythritol synthase, 4-diphosphocytidyl-2C-methyl-D-erythritol kinase, or chorismate lyase activity.
 97. The method of claim 93, wherein said at least one polypeptide is a UbiC polypeptide or a LytB polypeptide.
 98. The method of claim 93, wherein said cell comprises a non-functional crtE sequence, ppsR sequence, or ccoN sequence.
 99. The method of claim 93, wherein said cell comprises a non-functional crtE sequence, ppsR sequence, and ccoN sequence.
 100. The method of claim 93, wherein said cell comprises a genomic deletion, wherein said deletion comprises at least a portion of a crtE sequence, ppsR sequence, or ccoN sequence, and wherein said cell comprises a non-functional crtE sequence, ppsR sequence, or ccoN sequence.
 101. A method for making an isoprenoid, said method comprising culturing a genetically modified cell under conditions wherein said cell produces said isoprenoid.
 102. The method of claim 101, wherein said isoprenoid is CoQ(10).
 103. The method of claim 101, wherein said cell comprises an exogenous nucleic acid.
 104. The method of claim 101, wherein said cell comprises a genomic deletion. 
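
By way of illustration only, the length and percent identity test recited in claims such as claim 1 can be sketched as a point-in-polygon check. The sketch below is not part of the claims. It assumes that the area of FIG. 26 is the quadrilateral obtained by joining points A, B, C, and D with straight lines, that length is plotted on the horizontal axis and percent identity on the vertical axis, and that points on the boundary count as within the area. The function names `cross` and `in_claimed_area` and the constants `CLAIM_1` and `CLAIM_2` are hypothetical names chosen for this example.

```python
from typing import Sequence, Tuple

Point = Tuple[float, float]  # (length, percent identity)


def cross(o: Point, a: Point, b: Point) -> float:
    """Return the z-component of the cross product (a - o) x (b - o)."""
    return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])


def in_claimed_area(point: Point, vertices: Sequence[Point]) -> bool:
    """Return True if `point` is inside or on the boundary of the convex
    polygon whose corners are `vertices`, listed in order around the polygon.

    Assumption: the area is bounded by straight edges and is convex, which
    holds for the A, B, C, D coordinates used below.
    """
    signs = []
    n = len(vertices)
    for i in range(n):
        c = cross(vertices[i], vertices[(i + 1) % n], point)
        if c != 0:  # a zero cross product means the point is on this edge's line
            signs.append(c > 0)
    # Inside or on the boundary: the point is on the same side of every edge.
    return all(signs) or not any(signs)


# Points A, B, C, D as recited in claim 1 (SEQ ID NO:1).
CLAIM_1 = [(3626, 100), (3626, 65), (50, 65), (12, 100)]
# Claim 2 moves point B to (3626, 85), which narrows the area.
CLAIM_2 = [(3626, 100), (3626, 85), (50, 65), (12, 100)]

if __name__ == "__main__":
    candidate = (3000, 70)  # 3000 nucleotides at 70 percent identity
    print("within claim 1 area:", in_claimed_area(candidate, CLAIM_1))
    print("within claim 2 area:", in_claimed_area(candidate, CLAIM_2))
```

Under these assumptions, a sequence of 3000 nucleotides having 70 percent identity over that length falls within the area of claim 1 but outside the narrower area of claim 2, because moving point B to (3626, 85) raises the lower edge of the area at long lengths.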